{"id":188,"date":"2013-11-12T22:40:48","date_gmt":"2013-11-12T14:40:48","guid":{"rendered":"http:\/\/blog.stlover.org\/?p=188"},"modified":"2013-11-12T23:33:37","modified_gmt":"2013-11-12T15:33:37","slug":"cuda%ef%bc%9a%e8%92%99%e7%89%b9%e5%8d%a1%e6%b4%9b%e4%bb%a5%e5%8f%8a%e9%9a%8f%e6%9c%ba%e6%95%b0%e6%94%af%e6%8c%81","status":"publish","type":"post","link":"http:\/\/blog.xuhao1.me\/?p=188","title":{"rendered":"CUDA: Monte Carlo and Random Number Support"},"content":{"rendered":"<p>Random numbers are the cornerstone of computational physics and far too important to gloss over, so this whole article is devoted to CUDA's random number generators.<\/p>\n<p><!--more--><\/p>\n<p><a title=\"CUDA\u5b66\u4e60\u76ee\u5f55\" href=\"http:\/\/blog.stlover.org\/?p=183\">Table of contents<\/a><\/p>\n<p>For now this article collects a few notes on random numbers. For a simple first application, see my other article, <a title=\"CUDA\u7b80\u5355\u79d1\u5b66\u8fd0\u7b97\u5c1d\u8bd5\uff1aMake It Quick and Clean!\" 
href=\"http:\/\/blog.stlover.org\/?p=190\">Some notes on Monte Carlo integration with CUDA<\/a><\/p>\n<p>First, the study material:<\/p>\n<p>http:\/\/docs.nvidia.com\/cuda\/curand\/introduction.html<\/p>\n<p>In many places I quote the guide verbatim, to avoid misunderstandings introduced by translation.<\/p>\n<p>A careful programmer has to consider not only the speed of a system API but also how rigorous it is, especially for something like Monte Carlo, which is built entirely on random numbers. Fortunately, CUDA gives us a very dependable set of random number generators:<\/p>\n<blockquote><p>The CURAND library provides facilities that focus on the simple and efficient generation of high-quality pseudorandom and quasirandom 
numbers.<\/p><\/blockquote>\n<p>&#8211; from the introduction<\/p>\n<p>That \u201cquasirandom\u201d sounds like a bold claim! Professor Ding Zejun's computational physics course only ever dared to say \u201cpseudorandom\u201d. (In cuRAND, though, quasirandom refers to low-discrepancy sequences such as Sobol, which cover the sample space more evenly; it is not a claim of being closer to true randomness.)<\/p>\n<p>So let's take a closer look!<\/p>\n<p>First, CUDA offers random numbers in two forms. The first is random numbers computed by the GPU and handed to the host, which immediately reminded me of the \u201chardware random number generator\u201d from the computational physics course; it is roughly that idea.<\/p>\n<p>Anyone who has used Bank of China online banking has probably seen its fancy password token. It shows a new 6-digit number every minute, stays synchronized with the server, and remains valid for several years. By that count the little gadget has to emit roughly half of all 6-digit numbers as one sequence over its lifetime, in step with the server to the minute, so the sequence is presumably stored in advance. Security aside (something to look into another time), for something this large in scale that also touches financial security, generating the sequence with the 16807 algorithm (the Lehmer generator with multiplier 16807) would really not be dependable.<\/p>\n<p>The other form is a random number generator that runs directly on the device. With exams approaching and homework to rush, I will start with the device generator.<\/p>\n<p>About the device API, NVIDIA says:<\/p>\n<blockquote><p>To use the device API, include the file\u00a0curand_kernel.h\u00a0in files that define kernels that use CURAND device functions. The device API includes functions for\u00a0<a href=\"http:\/\/docs.nvidia.com\/cuda\/curand\/device-api-overview.html#pseudorandom-sequences\">pseudorandom generation<\/a>\u00a0and\u00a0<a href=\"http:\/\/docs.nvidia.com\/cuda\/curand\/device-api-overview.html#quasirandom-sequences\">quasirandom generation<\/a>.<\/p><\/blockquote>\n<p>CUDA uses several random number generation algorithms, including:<\/p>\n<blockquote><p>\u00a0XORWOW,\u00a0MRG32k3a,\u00a0MTGP32<\/p><\/blockquote>\n<p>Google the algorithms themselves; I will analyze them some other time. (Who was just claiming to be a careful programmer? Hmph...)<\/p>\n<p>A random number generator generally has to do two things, and only two: initialize, and return a value. As far as I remember, Pascal's random number generator is used like this:<\/p>\n<ol class=\"linenums\">\n<li class=\"L0\"><span class=\"pun\">&#8230;<\/span><\/li>\n<li 
class=\"L1\"><span class=\"pln\">randomize<\/span><span class=\"pun\">();<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">a<\/span><span class=\"pun\">:=<\/span><span class=\"pln\">random<\/span><span class=\"pun\">();<\/span><\/li>\n<li class=\"L3\"><span class=\"pun\">&#8230;<\/span><\/li>\n<\/ol>\n<p>Very quick and painless, but with CUDA we have to go step by step.<\/p>\n<p>The initialization function is declared as follows:<\/p>\n<ol class=\"linenums\">\n<li class=\"L0\"><span class=\"pln\">__device__\u00a0<\/span><span class=\"kwd\">void<\/span><\/li>\n<li class=\"L1\"><span class=\"pln\">curand_init\u00a0<\/span><span class=\"pun\">(<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">unsigned<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0seed<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">unsigned<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0sequence<\/span><span class=\"pun\">,<\/span><\/li>\n<li class=\"L3\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">unsigned<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0offset<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\u00a0<\/span><span class=\"typ\">curandState_t<\/span><span class=\"pln\">\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">state<\/span><span class=\"pun\">)<\/span><\/li>\n<\/ol>\n<p>The original explanation reads:<\/p>\n<blockquote><p>The\u00a0<samp>curand_init()<\/samp>\u00a0function sets up an initial state allocated by 
the caller using the given seed, sequence number, and offset within the sequence. Different seeds are guaranteed to produce different starting states and different sequences. The same seed always produces the same state and the same sequence. The state set up will be the state after 2<sup>67<\/sup>\u00a0\u22c5\u00a0<samp>sequence<\/samp>\u00a0+\u00a0<samp>offset<\/samp>\u00a0calls to\u00a0<samp>curand()<\/samp>\u00a0from the seed state.<\/p><\/blockquote>\n<p>CUDA can generate many random sequences. Each one is determined by the seed that produces it, by a sequence number that selects a subsequence (per the quote above, subsequences start 2<sup>67<\/sup> calls apart), and by an offset into it. Random number generation is an iterative process, and CUDA can skip ahead a given number of iterations at initialization, which solves the problem of having too few seeds.<\/p>\n<p>The traditional, common way to supply a seed is to use the current system time (in microseconds). For a serial program that is simple and effective, but for a parallel program it is clearly unwise: when the host launches the GPU kernel the time is a single fixed value, and that is the only value available as a seed; even with some post-processing it may not meet the requirements. So CUDA provides this scheme, which is actually more practical for users who do not need full rigor.<\/p>\n<p>Each random sequence in CUDA is tracked by a state object. To get the next value of a given sequence, call<\/p>\n<ol class=\"linenums\">\n<li class=\"L0\"><span class=\"pln\">__device__\u00a0<\/span><span class=\"kwd\">unsigned<\/span><span class=\"pln\">\u00a0<\/span><span class=\"typ\">int<\/span><\/li>\n<li class=\"L1\"><span class=\"pln\">curand\u00a0<\/span><span class=\"pun\">(<\/span><span class=\"typ\">curandState_t<\/span><span class=\"pln\">\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">state<\/span><span class=\"pun\">);<\/span><\/li>\n<\/ol>\n<p>That is all there is to it. Here is an example; the complete code is on <a title=\"GitHub\" href=\"https:\/\/github.com\/xuhao1\/CUDA_Learning\" target=\"_blank\">my GitHub<\/a>.<\/p>\n<p>The kernel code:<\/p>\n<ol class=\"linenums\">\n<li class=\"L0\"><span class=\"pln\">__global__\u00a0<\/span><span class=\"kwd\">void<\/span><span class=\"pln\">\u00a0random_gpu<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">*<\/span><span class=\"pln\">\u00a0C<\/span><span class=\"pun\">,<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">*<\/span><span class=\"pln\">\u00a0time<\/span><span class=\"pun\">,<\/span><span class=\"pln\">curandState<\/span><span class=\"pun\">*<\/span><span class=\"pln\">state<\/span><span class=\"pun\">)<\/span><\/li>\n<li class=\"L1\"><span class=\"pun\">{<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0i\u00a0<\/span><span class=\"pun\">=<\/span><span class=\"pln\">\u00a0threadIdx<\/span><span class=\"pun\">.<\/span><span class=\"pln\">x<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L3\"><span 
class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0seed<\/span><span class=\"pun\">=(*<\/span><span class=\"pln\">time<\/span><span class=\"pun\">)*(<\/span><span class=\"pln\">i<\/span><span class=\"pun\">+<\/span><span class=\"lit\">1<\/span><span class=\"pun\">);<\/span><span class=\"com\">\/\/ every thread sees the same launch time, so derive a per-thread seed with a simple transform<\/span><\/li>\n<li class=\"L4\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"typ\">int<\/span><span class=\"pln\">\u00a0offset<\/span><span class=\"pun\">=<\/span><span class=\"lit\">0<\/span><span class=\"pun\">;<\/span><span class=\"com\">\/\/ the sequences are fully independent, so offset stays 0 to save time<\/span><\/li>\n<li class=\"L5\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0curand_init\u00a0<\/span><span class=\"pun\">(<\/span><span class=\"pln\">seed<\/span><span class=\"pun\">,<\/span><span class=\"pln\">i<\/span><span class=\"pun\">,<\/span><span class=\"pln\">offset<\/span><span class=\"pun\">,&amp;<\/span><span class=\"pln\">state<\/span><span class=\"pun\">[<\/span><span class=\"pln\">i<\/span><span class=\"pun\">]);<\/span><span class=\"com\">\/\/ set up the i-th random sequence<\/span><\/li>\n<li class=\"L6\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0C<\/span><span class=\"pun\">[<\/span><span class=\"pln\">i<\/span><span class=\"pun\">]=<\/span><span class=\"pln\">curand<\/span><span class=\"pun\">(&amp;<\/span><span class=\"pln\">state<\/span><span class=\"pun\">[<\/span><span class=\"pln\">i<\/span><span class=\"pun\">]);<\/span><span class=\"com\">\/\/ draw one random value from the i-th sequence<\/span><\/li>\n<li class=\"L7\"><span class=\"pun\">}<\/span><\/li>\n<\/ol>\n<p>The main function:<\/p>\n<ol 
class=\"linenums\">\n<li class=\"L0\"><span class=\"typ\">int<\/span><span class=\"pln\">\u00a0main<\/span><span class=\"pun\">()<\/span><\/li>\n<li class=\"L1\"><span class=\"pun\">{<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"typ\">size_t<\/span><span class=\"pln\">\u00a0size\u00a0<\/span><span class=\"pun\">=<\/span><span class=\"pln\">\u00a0N\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">sizeof<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">);<\/span><span class=\"com\">\/\/ C and d_C hold long values<\/span><\/li>\n<li class=\"L3\"><span class=\"pln\">\u00a0<\/span><\/li>\n<li class=\"L4\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">*<\/span><span class=\"pln\">\u00a0C<\/span><span class=\"pun\">=<\/span><span class=\"kwd\">new<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">[<\/span><span class=\"pln\">N<\/span><span class=\"pun\">];<\/span><\/li>\n<li class=\"L5\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0st<\/span><span class=\"pun\">=<\/span><span class=\"pln\">getCurrentTime<\/span><span class=\"pun\">();<\/span><\/li>\n<li class=\"L6\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0curandState\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">state<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L7\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">d_C<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L8\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0cudaMalloc<\/span><span class=\"pun\">(&amp;<\/span><span class=\"pln\">state<\/span><span class=\"pun\">,<\/span><span class=\"kwd\">sizeof<\/span><span class=\"pun\">(<\/span><span 
class=\"pln\">curandState<\/span><span class=\"pun\">)*<\/span><span class=\"pln\">N<\/span><span class=\"pun\">);<\/span><span class=\"com\">\/\/ allocate the array of random states<\/span><\/li>\n<li class=\"L9\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0cudaMalloc<\/span><span class=\"pun\">(&amp;<\/span><span class=\"pln\">d_C<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\u00a0size<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L0\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0random_gpu<\/span><span class=\"pun\">&lt;&lt;&lt;<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"pln\">N<\/span><span class=\"pun\">&gt;&gt;&gt;(<\/span><span class=\"pln\">d_C<\/span><span class=\"pun\">,<\/span><span class=\"pln\">getCurrentTimeForDev<\/span><span class=\"pun\">(),<\/span><span class=\"pln\">state<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L1\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0cudaMemcpy<\/span><span class=\"pun\">(<\/span><span class=\"pln\">C<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\u00a0d_C<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\u00a0size<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\u00a0cudaMemcpyDeviceToHost<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0ed<\/span><span class=\"pun\">=<\/span><span class=\"pln\">getCurrentTime<\/span><span class=\"pun\">();<\/span><\/li>\n<li class=\"L3\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0printf<\/span><span class=\"pun\">(<\/span><span class=\"str\">&quot;gpu\u00a0running\u00a0time:%ld\\n&quot;<\/span><span class=\"pun\">,<\/span><span class=\"pln\">ed<\/span><span class=\"pun\">-<\/span><span class=\"pln\">st<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L4\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0cudaFree<\/span><span 
class=\"pun\">(<\/span><span class=\"pln\">d_C<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L5\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">for<\/span><span class=\"pun\">(<\/span><span class=\"typ\">int<\/span><span class=\"pln\">\u00a0i<\/span><span class=\"pun\">=<\/span><span class=\"lit\">0<\/span><span class=\"pun\">;<\/span><span class=\"pln\">i<\/span><span class=\"pun\">&lt;<\/span><span class=\"lit\">10<\/span><span class=\"pun\">;<\/span><span class=\"pln\">i<\/span><span class=\"pun\">++)<\/span><\/li>\n<li class=\"L6\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"pun\">{<\/span><\/li>\n<li class=\"L7\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0printf<\/span><span class=\"pun\">(<\/span><span class=\"str\">&quot;%ld &quot;<\/span><span class=\"pun\">,<\/span><span class=\"pln\">C<\/span><span class=\"pun\">[<\/span><span class=\"pln\">i<\/span><span class=\"pun\">]);<\/span><\/li>\n<li class=\"L8\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"pun\">}<\/span><\/li>\n<li class=\"L9\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">delete<\/span><span class=\"pun\">[]<\/span><span class=\"pln\">\u00a0C<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L0\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0printf<\/span><span class=\"pun\">(<\/span><span class=\"str\">&quot;\\n&quot;<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L1\"><span class=\"pun\">}<\/span><\/li>\n<\/ol>\n<p>That completes the program, apart from a few details. CUDA's memory management is a bit tedious, and this example hands every input to the kernel as a device pointer (a scalar such as the time could also be passed by value), so getCurrentTimeForDev() returns the time directly as a device pointer:<\/p>\n<ol 
class=\"linenums\">\n<li class=\"L0\"><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0getCurrentTime<\/span><span class=\"pun\">()<\/span><span class=\"pln\">\u00a0\u00a0<\/span><\/li>\n<li class=\"L1\"><span class=\"pun\">{<\/span><span class=\"pln\">\u00a0\u00a0<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">struct<\/span><span class=\"pln\">\u00a0timeval\u00a0tv<\/span><span class=\"pun\">;<\/span><span class=\"pln\">\u00a0\u00a0<\/span><\/li>\n<li class=\"L3\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0gettimeofday<\/span><span class=\"pun\">(&amp;<\/span><span class=\"pln\">tv<\/span><span class=\"pun\">,<\/span><span class=\"pln\">NULL<\/span><span class=\"pun\">);<\/span><span class=\"pln\">\u00a0\u00a0<\/span><\/li>\n<li class=\"L4\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">return<\/span><span class=\"pln\">\u00a0tv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">tv_sec\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">1000000<\/span><span class=\"pln\">\u00a0<\/span><span class=\"pun\">+<\/span><span class=\"pln\">\u00a0tv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">tv_usec<\/span><span class=\"pun\">;<\/span><span class=\"pln\">\u00a0\u00a0<\/span><\/li>\n<li class=\"L5\"><span class=\"pun\">}<\/span><\/li>\n<li class=\"L6\"><span class=\"kwd\">long<\/span><span class=\"pun\">*<\/span><span class=\"pln\">getCurrentTimeForDev<\/span><span class=\"pun\">()<\/span><\/li>\n<li class=\"L7\"><span class=\"pun\">{<\/span><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">time<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L8\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0cudaMalloc<\/span><span class=\"pun\">(&amp;<\/span><span class=\"pln\">time<\/span><span 
class=\"pun\">,<\/span><span class=\"kwd\">sizeof<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">));<\/span><\/li>\n<li class=\"L9\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pln\">\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">timenow<\/span><span class=\"pun\">=<\/span><span class=\"kwd\">new<\/span><span class=\"pln\">\u00a0<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L0\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"pun\">*<\/span><span class=\"pln\">timenow<\/span><span class=\"pun\">=<\/span><span class=\"pln\">getCurrentTime<\/span><span class=\"pun\">();<\/span><\/li>\n<li class=\"L1\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0cudaMemcpy<\/span><span class=\"pun\">(<\/span><span class=\"pln\">time<\/span><span class=\"pun\">,<\/span><span class=\"pln\">timenow<\/span><span class=\"pun\">,<\/span><span class=\"kwd\">sizeof<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">long<\/span><span class=\"pun\">),<\/span><span class=\"pln\">cudaMemcpyHostToDevice<\/span><span class=\"pun\">);<\/span><\/li>\n<li class=\"L2\"><span class=\"pln\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"kwd\">return<\/span><span class=\"pln\">\u00a0time<\/span><span class=\"pun\">;<\/span><\/li>\n<li class=\"L3\"><span class=\"pun\">}<\/span><\/li>\n<\/ol>\n<p>Then run it:<\/p>\n<ol class=\"linenums\">\n<li class=\"L0\"><span class=\"pln\">CUDA_Learning$\u00a0<\/span><span class=\"pun\">.\/<\/span><span class=\"pln\">rand<\/span><\/li>\n<li class=\"L1\"><span class=\"pln\">gpu\u00a0running\u00a0time<\/span><span class=\"pun\">:<\/span><span class=\"lit\">490969<\/span><\/li>\n<li class=\"L2\"><span class=\"lit\">3500780692<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">144000514<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">797145604<\/span><span 
class=\"pln\">\u00a0<\/span><span class=\"lit\">4199381721<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">2915119786<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">2557417372<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">4151631408<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">1974695633<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">2578972200<\/span><span class=\"pln\">\u00a0<\/span><span class=\"lit\">2865224429<\/span><span class=\"pln\">\u00a0<\/span><\/li>\n<\/ol>\n<p>Random number generation really is surprisingly slow... the run took almost half a second. Note that the timed interval also covers both cudaMalloc calls, CUDA's first-call initialization and the copy back to the host, not only curand_init() and curand(), so the figure is an upper bound on the generator's cost.<\/p>\n<p>The next example goes straight to <a title=\"CUDA\u7b80\u5355\u79d1\u5b66\u8fd0\u7b97\u5c1d\u8bd5\uff1aMake It Quick and Clean!\" href=\"http:\/\/blog.stlover.org\/?p=190\" target=\"_blank\">using this random sequence for Monte Carlo integration<\/a>. Please follow the link.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Random numbers are the cornerstone of computational physics and far too important to gloss over, so this whole article is devoted to CUDA's random number generators.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[5,7],"tags":[],"_links":{"self":[{"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=\/wp\/v2\/posts\/188"}],"collection":[{"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=\/wp\/v2\/users\/1"}],"
replies":[{"embeddable":true,"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=188"}],"version-history":[{"count":4,"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=\/wp\/v2\/posts\/188\/revisions"}],"predecessor-version":[{"id":199,"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=\/wp\/v2\/posts\/188\/revisions\/199"}],"wp:attachment":[{"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=188"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=188"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/blog.xuhao1.me\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=188"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}