{"id":942,"date":"2022-03-01T10:40:30","date_gmt":"2022-03-01T16:40:30","guid":{"rendered":"https:\/\/asberry.org\/blog_tech\/?p=942"},"modified":"2022-03-03T00:19:26","modified_gmt":"2022-03-03T06:19:26","slug":"python-and-pandas-on-jupyter","status":"publish","type":"post","link":"https:\/\/asberry.org\/blog_tech\/?p=942","title":{"rendered":"Python and Pandas on Jupyter"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Maybe it should be in Jupyter??? In any case, I&#8217;ve been studying Python in Jupyter notebooks and it&#8217;s some pretty radical stuff. Using NumPy and %matplotlib inline can yield some incredible results. This is a list of the commonly used features and samples thereof.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\" id=\"loading-dataset\">Loading Dataset<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\nimport pandas as pd\n\ndf18 = pd.read_csv(&#039;all_alpha_18.csv&#039;)\ndf18.head()\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"consise-summary-of-columns-and-rows-pandas-dataframe-info\">Concise summary of columns and rows &#8211; pandas.DataFrame.info<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\ndf18.info()\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"print-duplicate-lines-pandas-dataframe-duplicated\">Print duplicate lines &#8211; pandas.DataFrame.duplicated<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\nduplicate = df08&#x5B;df08.duplicated()]\nprint(&quot;Duplicate Rows :&quot;)\nduplicate\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" 
id=\"count-duplicate-lines-pandas-dataframe-duplicated\">Count duplicate lines &#8211; pandas.DataFrame.duplicated<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\nprint(df08.duplicated().sum())\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"print-count-lines-missing-data-pandas-dataframe-isnull\">Print\/Count lines missing data &#8211; pandas.DataFrame.isnull<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\nnull_data = df08&#x5B;df08.isnull().any(axis=1)]\nprint(null_data)\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"column-data-types-pandas-dataframe-dtypes\">Column data types &#8211; pandas.DataFrame.dtypes<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\ndf08.dtypes\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"distinct-values-in-columns-pandas-dataframe-unique\">Distinct values in columns &#8211; pandas.Series.unique<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\nSmartWayColCnt08 = df08&#x5B;&#039;SmartWay&#039;].unique()\nSmartWayColCnt08.size\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"dropping-columns-pandas-dataframe-drop\">Dropping columns &#8211; pandas.DataFrame.drop<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\ndf_08.drop(&#x5B;&#039;Stnd&#039;, &#039;Underhood ID&#039;, &#039;FE Calc Appr&#039;, &#039;Unadj Cmb MPG&#039;], axis=1, 
inplace=True)\ndf_18.drop(&#x5B;&#039;Stnd&#039;, &#039;Stnd Description&#039;, &#039;Underhood ID&#039;, &#039;Comb CO2&#039;], axis=1, inplace=True)\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"rename-columns-pandas-dataframe-rename\">Rename columns &#8211; pandas.DataFrame.rename<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\ndf_08.rename(columns={&#039;Sales Area&#039;: &#039;Cert Region&#039;}, inplace=True)\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"replace-spaces-with-underscores-lowercase-labels\">Replace spaces with underscores, lowercase labels<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\ndf_08.rename(columns=lambda x: x.strip().lower().replace(&quot; &quot;, &quot;_&quot;), inplace=True)\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"confirm-column-lables-are-identical\">Confirm column labels are identical<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; gutter: false; light: false; title: ; toolbar: true; notranslate\" title=\"\">\n(df_08.columns == df_18.columns).all()\n<\/pre><\/div>","protected":false},"excerpt":{"rendered":"<p>Maybe it should be in Jupyter??? In any case, I&#8217;ve been studying Python in Jupyter notebooks and it&#8217;s some pretty radical stuff. Using NumPy and %matplotlib inline can yield some incredible results. 
This is a list of the commonly used features and samples thereof.<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[24],"tags":[],"class_list":["post-942","post","type-post","status-publish","format-standard","hentry","category-python","author-aron"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4bBkH-fc","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts\/942","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=942"}],"version-history":[{"count":8,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts\/942\/revisions"}],"predecessor-version":[{"id":953,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=\/wp\/v2\/posts\/942\/revisions\/953"}],"wp:attachment":[{"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&par
ent=942"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=942"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/asberry.org\/blog_tech\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=942"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}