{"id":1049,"date":"2022-12-13T05:16:21","date_gmt":"2022-12-13T11:16:21","guid":{"rendered":"https:\/\/asberry.org\/blog_tech\/?p=1049"},"modified":"2022-12-13T05:16:21","modified_gmt":"2022-12-13T11:16:21","slug":"python-read-csv","status":"publish","type":"post","link":"https:\/\/asberry.org\/blog_tech\/?p=1049","title":{"rendered":"Python &#8211; Read CSV"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">One of the more important things I need to attend to is reading a CSV file and examining it. While there is a plethora of documentation on this, since this is my blog I&#8217;m documenting my most used cases.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; light: false; title: ; toolbar: true; notranslate\" title=\"\">\nimport pandas as pd\n\n# Read every column as text and keep empty fields as empty strings\ndfOriginalCSV = pd.read_csv(&quot;csvFile.csv&quot;, sep=&quot;,&quot;, dtype=str, keep_default_na=False, encoding=&#039;utf-8&#039;)\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">The file here is csvFile.csv. The sep argument names the separator character; a comma is the default, so we don&#8217;t have to declare it, but spelling it out helps when those pesky pipes show up. Declaring dtype=str tells pandas to read every column as a string, so it doesn&#8217;t do odd tricks with numbers, such as dropping leading zeros. Setting keep_default_na=False suppresses pandas&#8217; overwhelming desire to put NaN into anything that doesn&#8217;t look like a proper value, such as an empty field or the text NA. And of course, always account for the encoding.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the more important things I need to attend to is reading a CSV file and examining it. While there is a plethora of documentation on this, since this is my blog I&#8217;m documenting my most used cases. 
So the file is csvFile.csv, while we don&#8217;t have to declare it the sep provides the [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[24],"tags":[],"class_list":["post-1049","post","type-post","status-publish","format-standard","hentry","category-python","author-aron"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4bBkH-gV","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts\/1049","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1049"}],"version-history":[{"count":1,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts\/1049\/revisions"}],"predecessor-version":[{"id":1050,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts\/1049\/revisions\/1050"}],"wp:attachment":[{"href":"https:\/\/asberry.org\/blog_tech\/in
dex.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1049"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1049"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1049"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}