{"id":359,"date":"2016-09-03T01:28:17","date_gmt":"2016-09-03T01:28:17","guid":{"rendered":"http:\/\/evolvedmicrobe.com\/blogs\/?p=359"},"modified":"2016-09-03T01:46:24","modified_gmt":"2016-09-03T01:46:24","slug":"profiling-rcpp-package-code-on-windows","status":"publish","type":"post","link":"http:\/\/evolvedmicrobe.com\/blogs\/?p=359","title":{"rendered":"Profiling Rcpp package code on Windows"},"content":{"rendered":"Profiling Rcpp code on Unix\/Mac is easy, but it is difficult on Windows because R uses a compilation toolchain (<a href=\"http:\/\/mingw.org\/\">MinGW<\/a>) that produces files that common Windows profiling programs do not understand. Additionally, the R build process often removes the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Debug_symbol\">symbols<\/a> that allow profilers to produce sensible interpretations of their data.\r\n\r\nThe following steps allow one to profile Rcpp code on Windows.\r\n<h2><strong>Change compilation settings to keep debug symbols<\/strong><\/h2>\r\nA default R installation typically places certain compiler settings in a file equivalent to <span class=\"lang:default decode:true crayon-inline\">C:\\Program Files\\R\\R-3.3.1\\etc\\x64\\Makeconf<\/span>, and these settings strip information needed for profiling during the Rcpp compilation process, in particular a line which reads: <span class=\"lang:sh decode:true crayon-inline \">DLLFLAGS=-s<\/span>. 
To override this and add the other flags that are needed, add a folder and file to your home directory that override and append the necessary compilation flags. To a file at a location equivalent to <span class=\"lang:sh decode:true crayon-inline \">C:\\Users\\YOURNAME\\.R\\Makevars<\/span> on your machine (note the &#8216;.&#8217; before R), add the following lines:\r\n<pre class=\"lang:sh decode:true\"># Emit debug information in DWARF-2 format when compiling C++ code\r\nCXXFLAGS+=-gdwarf-2\r\n# Clear DLLFLAGS so the default -s (strip symbols) is not passed to the linker\r\nDLLFLAGS=\r\n<\/pre>\r\nYou can verify this worked by checking that <code>-gdwarf-2<\/code> appears in the compilation messages and that <code>-s<\/code> is absent from the final linker step.\r\n<h2>Run a profiler which understands MinGW compiled code<\/h2>\r\nThe next key step is to run a profiler that can understand the Unix-like symbols on Windows. Two good, free options are <a href=\"http:\/\/www.codersnotes.com\/sleepy\/\">Very Sleepy<\/a> and <a href=\"http:\/\/developer.amd.com\/tools-and-sdks\/archive\/compute\/amd-codeanalyst-performance-analyzer\/\">AMD&#8217;s CodeAnalyst<\/a> (which also works on Intel chips). Very Sleepy is very good at basic timings and providing stack traces, while AMD&#8217;s profiler is able to drill down to the assembly of a process. 
Both profilers are good, but the example below uses AMD&#8217;s.\r\n<ol>\r\n \t<li>Open the program and set up a quick session to start and run a sample R script that uses your code, as in the example shown below.<a href=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png\"><img data-attachment-id=\"364\" data-permalink=\"http:\/\/evolvedmicrobe.com\/blogs\/?attachment_id=364\" data-orig-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png?fit=535%2C324\" data-orig-size=\"535,324\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"AMD_ProfilerSettings\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png?fit=300%2C182\" data-large-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png?fit=535%2C324\" loading=\"lazy\" class=\"size-full wp-image-364 aligncenter\" src=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png?resize=535%2C324\" alt=\"AMD_ProfilerSettings\" width=\"535\" height=\"324\" srcset=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png?w=535 535w, https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/AMD_ProfilerSettings.png?resize=300%2C182 300w\" sizes=\"(max-width: 535px) 100vw, 535px\" data-recalc-dims=\"1\" 
\/><\/a><\/li>\r\n \t<li>Next, run the profiler and look at the results. For example, here I can see that half the time was spent in my code and half in the R core&#8217;s code (generating random numbers).<img data-attachment-id=\"366\" data-permalink=\"http:\/\/evolvedmicrobe.com\/blogs\/?attachment_id=366\" data-orig-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/ProfilerResults1.png?fit=538%2C182\" data-orig-size=\"538,182\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"ProfilerResults1\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/ProfilerResults1.png?fit=300%2C101\" data-large-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/ProfilerResults1.png?fit=538%2C182\" loading=\"lazy\" class=\"size-full wp-image-366 aligncenter\" src=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/ProfilerResults1.png?resize=538%2C182\" alt=\"ProfilerResults1\" width=\"538\" height=\"182\" srcset=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/ProfilerResults1.png?w=538 538w, https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/ProfilerResults1.png?resize=300%2C101 300w\" sizes=\"(max-width: 538px) 100vw, 538px\" data-recalc-dims=\"1\" \/>Digging further down, I can see at the assembly level what the biggest bottlenecks were in my code.<\/li>\r\n<\/ol>\r\n<a 
href=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png\"><img data-attachment-id=\"368\" data-permalink=\"http:\/\/evolvedmicrobe.com\/blogs\/?attachment_id=368\" data-orig-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?fit=697%2C356\" data-orig-size=\"697,356\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"assembly\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?fit=300%2C153\" data-large-file=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?fit=625%2C319\" loading=\"lazy\" class=\"size-full wp-image-368 aligncenter\" src=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?resize=625%2C319\" alt=\"assembly\" width=\"625\" height=\"319\" srcset=\"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?w=697 697w, https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?resize=300%2C153 300w, https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2016\/09\/assembly.png?resize=624%2C319 624w\" sizes=\"(max-width: 625px) 100vw, 625px\" data-recalc-dims=\"1\" \/><\/a>\r\n\r\nIt&#8217;s often helpful to look at the original source files in addition to the assembly, and this can be enabled by setting the source directories under Tools -&gt; CodeAnalyst Options -&gt; 
Directories.\r\n\r\n&nbsp;","protected":false},"excerpt":{"rendered":"Profiling Rcpp code on Unix\/Mac is easy, but is difficult on Windows because R uses a compilation toolchain (MinGW) that produces files that are not understood by common Windows profiling programs.\u00a0 Additionally, the R build process often removes\u00a0symbols which allow profilers to produce sensible interpretations of their data. The following steps allow one to profile [&hellip;]","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[1],"tags":[23,21,22],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":398,"url":"http:\/\/evolvedmicrobe.com\/blogs\/?p=398","url_meta":{"origin":359,"position":0},"title":".NET Bio is Significantly Faster on .Net Core 2.0","date":"November 5, 2017","format":false,"excerpt":"Summary: With the release of .NET Core 2.0, .NET Bio is able to run significantly faster (~2X) on Mac OSX due to better compilation and memory mangement. The .NET Bio\u00a0library contains libraries for genomic data processing tasks like parsing, alignment, etc. 
that are too computationally intense to be\u00a0undertaken with interpreted\u2026","rel":"","context":"In \".NET Bio\"","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2017\/11\/Benchmark-1.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":376,"url":"http:\/\/evolvedmicrobe.com\/blogs\/?p=376","url_meta":{"origin":359,"position":1},"title":"Why R Math Functions on Windows are Slow, and How to Fix It","date":"September 19, 2016","format":false,"excerpt":"R on windows has much slower versions of the log, sine and cosine functions than are available on other platforms, and this can be a serious performance bottleneck for programs which frequently call these math functions.\u00a0 The reason for this is that the library R uses to obtain the log\u2026","rel":"","context":"In \"R\"","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":12,"url":"http:\/\/evolvedmicrobe.com\/blogs\/?p=12","url_meta":{"origin":359,"position":2},"title":"Compile Bowtie2 on Windows 64 bit.","date":"January 30, 2013","format":false,"excerpt":"Bowtie 2 is a program that efficiently aligns next generation sequence data to a reference genome. However, the version distributed by the authors only compiles on POSIX platforms. 
These instructions will allow you to compile it on windows by downloading the Mingw64 tools and editing the make file before building\u2026","rel":"","context":"In &quot;Computing&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2013\/01\/Capture.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":188,"url":"http:\/\/evolvedmicrobe.com\/blogs\/?p=188","url_meta":{"origin":359,"position":3},"title":"The .NET Bio BAM Parser is Smoking Fast","date":"October 12, 2013","format":false,"excerpt":"The .NET Bio library has an improved version of it's BAM file\u00a0parser, which makes it significantly faster and easily competitive with the\u00a0current standard C coded SAMTools for obtaining\u00a0sequencing data and working with it. The chart below compares the time it\u00a0takes in seconds for the old version of the parser and\u2026","rel":"","context":"In &quot;.NET Bio&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2013\/10\/img5.gif?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":153,"url":"http:\/\/evolvedmicrobe.com\/blogs\/?p=153","url_meta":{"origin":359,"position":4},"title":"Using Selectome with .NET Bio, F# and R","date":"September 16, 2013","format":false,"excerpt":"The Bio.Selectome namespace has features to query\u00a0Selectome.Selectome is a database that merges data from Ensembl\u00a0and the programs in PAML used to compute the ratio of non-synonymous to synonymous (dN\/dS)\u00a0mutations along various branches of the phylogenetic tree. 
A low dN\/dS ratio\u00a0indicates that the protein sequence is under strong selective constraint, while\u2026","rel":"","context":"In &quot;.NET Bio&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":6,"url":"http:\/\/evolvedmicrobe.com\/blogs\/?p=6","url_meta":{"origin":359,"position":5},"title":"Not All Poisson Random Variables Are Created Equally","date":"January 30, 2013","format":false,"excerpt":"Spurred by a slow running program, I spent an afternoon researching what algorithms are available for generating Poisson random variables and figuring out which methods are used by R, Matlab, NumPy, the GNU Science Libraray and various other available packages. I learned some things that I think would be useful\u2026","rel":"","context":"In &quot;Algorithms&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/evolvedmicrobe.com\/blogs\/wp-content\/uploads\/2013\/01\/img34-300x239.jpg?resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/359"}],"collection":[{"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=359"}],"version-history":[{"count":13,"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/359\/revisions"}],"predecessor-version":[{"id":375,"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/359\/revisions\/375"}],"wp:attachment":[{"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/evolvedmic
robe.com\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=359"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/evolvedmicrobe.com\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}