Python

Python – Read CSV

One of the more important things I need to attend to is reading a CSV file and examining it. While there is a plethora of documentation on this, since this is my blog I’m documenting my most used cases.

import pandas as pd
dfOriginalCSV = pd.read_csv("csvFile.csv", sep=",", dtype=str, keep_default_na=False, encoding='utf-8')

So the file is csvFile.csv. We don’t have to declare sep, since a comma is the default, but spelling it out makes it easy to swap in another separator for those pesky pipes. Declaring dtype=str tells pandas to treat every column as a string so it doesn’t do odd tricks with numbers, such as dropping leading zeros. keep_default_na=False suppresses pandas’ overwhelming desire to put NaN into anything that doesn’t look like a proper value (empty fields, "NA", "NULL" and so on). And of course, always account for the encoding.
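
To see what those options actually change, here’s a minimal sketch that reads the same data twice. The sample data, the pipe separator and the column names are all made up for illustration, and it reads from an in-memory string rather than a file, so the encoding argument is left out.

```python
import io

import pandas as pd

# Hypothetical sample data: pipe-separated, with a leading-zero ZIP code,
# a literal "NA" and an empty field.
sample = "id|zip|note\n1|01234|NA\n2|98765|\n"

# Default behaviour: pandas infers types and turns "NA" and "" into NaN.
df_default = pd.read_csv(io.StringIO(sample), sep="|")

# Everything read as a string, nothing converted to NaN.
df_strings = pd.read_csv(
    io.StringIO(sample), sep="|", dtype=str, keep_default_na=False
)

print(df_default["zip"].tolist())   # leading zero is lost
print(df_strings["zip"].tolist())   # leading zero survives
print(df_strings["note"].tolist())  # "NA" and "" stay plain strings
```

If you do want some markers treated as missing while keeping everything else as text, read_csv also accepts an na_values argument alongside keep_default_na=False.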

PowerShell

PowerShell – Count number of commas in each row of a CSV

Sometimes your data is so bad that you can’t trust a two-column CSV to be turned into a dictionary or, more aptly, the file that is supposed to be two columns won’t load in Python for conversion to a dictionary because it doesn’t see two columns. So where is the issue? Probably a column count. You can usually spot extra columns in Excel, but what if there are fewer? This PS snippet reads a given CSV, counts the number of columns in the header row and reports every line whose count differs. Assuming your header row is correct, that tells you which lines are keeping the file from validating. One caveat: it splits on every comma, so a quoted value that contains a comma is counted as an extra column.

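# Read every line; @() keeps this an array even for a one-line file
$data = @(Get-Content location.csv)

# Number of columns in the header row
$header = $data[0].Split(",").Count
# Line counter (1-based, so line 1 is the header)
$i = 1

# Report lines with fewer or more columns than the header
$data | ForEach-Object {
    $c = $_.Split(",").Count
    if ($c -ne $header) { "Line $i - $c columns (expected $header)" }
    $i++
}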
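
Since the file is headed for Python anyway, the same check can be sketched there. This is a rough equivalent, not a port of the snippet above: the function name find_bad_rows and the sample data are made up for illustration, and because it uses the standard csv module, a quoted value with a comma in it counts as one column rather than two.

```python
import csv
import io


def find_bad_rows(lines, delimiter=","):
    """Return the header's column count and a list of
    (line number, column count) pairs for rows that differ from it."""
    reader = csv.reader(lines, delimiter=delimiter)
    expected = len(next(reader))  # the header row sets the expectation
    problems = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the number of source lines read so far
            problems.append((reader.line_num, len(row)))
    return expected, problems


# Hypothetical sample: line 3 is short, line 5 has an extra column,
# line 4 has a quoted comma and is fine.
sample = io.StringIO('key,value\na,1\nb\nc,"3,5"\nd,4,extra\n')
expected, problems = find_bad_rows(sample)

for line_number, count in problems:
    print(f"Line {line_number} - {count} columns (expected {expected})")
```

For a real file, pass an open file handle instead of the StringIO, opened with newline="" as the csv module documentation recommends.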
Python

Python – FutureWarning

Being new to Python and running it in Jupyter notebooks, I sometimes get messages that just don’t make sense, and it’s a bit frustrating when you can’t work out what they’re telling you. Let’s take this gem, which is a warning rather than an error:

/home/aron/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py:3397: FutureWarning: In a future version of pandas all arguments of read_csv except for the argument 'filepath_or_buffer' will be keyword-only.
  exec(code_obj, self.user_global_ns, self.user_ns)

So what the hell does that mean? I understand pandas, arguments, and what read_csv is but what is keyword-only???

It turns out that keyword-only means that instead of relying on the order of the arguments, you have to pass each one by name, for everything except filepath_or_buffer…

So this piece of code triggers the warning:

# Load in the general demographics data.
azdias = pd.read_csv('Udacity_AZDIAS_Subset.csv', ';')

And this is the fixed version:

# Load in the general demographics data.
azdias = pd.read_csv('Udacity_AZDIAS_Subset.csv', delimiter=';')
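
If keyword-only is still fuzzy, here’s a minimal sketch in plain Python with no pandas involved. The function load_table is made up for illustration; the bare * in its signature is what makes every argument after it keyword-only.

```python
def load_table(path, *, sep=","):
    """Toy stand-in for a loader: everything after the bare * must be
    passed by name."""
    return f"{path} split on {sep!r}"


# Passing sep by name works.
print(load_table("data.csv", sep=";"))

# Passing it by position does not: Python raises a TypeError.
try:
    load_table("data.csv", ";")
except TypeError as exc:
    print("TypeError:", exc)
```

The pandas warning is the polite version of that TypeError: the positional call still works for now, but it is on notice.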
Python

Python – Dataframe – Unique Value in Column

This is the pandas equivalent of SELECT DISTINCT, returning the individual values of a column:

dataFrame.column.unique()
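
A quick sketch of what that looks like in practice. The DataFrame and its color column are made up for illustration. The bracket form does the same job as the dot form and is the one to reach for when a column name has spaces in it or clashes with a DataFrame method.

```python
import pandas as pd

# Hypothetical data with repeated values.
df = pd.DataFrame({"color": ["red", "blue", "red", "green", "blue"]})

# Distinct values, in order of first appearance.
distinct = df["color"].unique()
print(list(distinct))

# Dot access gives the same result for a simple column name.
print(list(df.color.unique()))

# Just the count of distinct values.
print(df["color"].nunique())
```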
Batch

Mute/Unmute System Volume

I work from home, so I have an office (a room set aside as one) and I try to keep office hours. But no matter how hard I try, I forget to turn the volume on in the morning so I get notifications for meetings, email and IMs, and to turn it off when leaving so I’m not hearing those notifications. You’d think this is fairly simple; you’d be thinking wrong.
I found a way to simulate a press of the mute button and was able to schedule it for 8 am and 5 pm. It’s not elegant, because the button is a toggle: if the sound is already muted at 5 pm, it gets un-muted. It’ll work for now until I can look deeper into it.

Create a batch file that you’ll call with Task Scheduler. It’s one line (173 is the key code for the volume-mute key):

powershell (new-object -com wscript.shell).SendKeys([char]173)
TCC

TCC – Taleo Connect Client Content Migrated

Hi and thanks for dropping by! I’m excited to announce that I’ve migrated the TCC content that was hosted here to the ThinkTalent.US website! With the loss of the TCC SIG, there’s a vacuum related to TCC information and community and I trust ThinkTalent to be the curator of this arcane knowledge moving forward. Looking for the best of the pack Oracle Taleo support? Drop ThinkTalent a line and they’ll be sure to get you pointed in the right direction!

Education

Command-Line in Current Directory

My friend and work cohort David Miller clued me in to this last year and it’s been one of the most helpful tips I’ve gotten in years. Need to drop to a command line for the directory you’re currently browsing in Windows Explorer? Simply type cmd in the address bar, press Enter, and you’ve got a command window homed to that directory!!!

Python

Python and Pandas on Jupyter

Maybe it should be in Jupyter??? In any case, I’ve been studying Python in Jupyter notebooks and it’s some pretty radical stuff. Using NumPy and %matplotlib inline can yield some incredible results. This is a list of the commonly used features and samples thereof.

Linux

Server Changeover Complete

Apologies for the site being offline from 2/13 – 2/18. The storage array needed to be changed over.
Short story: Hard drive crashed.
Long story: Back in 5/20, I found WD SAS 3TB drives on Newegg for like $50 apiece, got a Dell H310 RAID card for $25, and set my Linux server up with 4 of those bad boys in a RAID 5 array. Turns out the drives were used, which brings me to the moral of the story: never buy from TDT Technologies on NewEgg.com. I had one drive fail about 6 months ago and knew it was a matter of time before another was going to die, so I ordered 4 new Seagate 4TB SATA drives and the cables to hook them into the H310 and went about my merry way until Sunday 2/13, when I lost the 2nd SAS drive and all was lost. Well, not really: the system was rsynced with the NAS, so no biggie, I had the data.
Now, I’ve been with Fedora since it was spun off from Red Hat a billion or so years ago in computer time. I love Fedora, and on my last build I was tempted to move to CentOS but couldn’t get it to run with the hardware, so Fedora to the rescue and Fedora it was. But now, thanks to the CentOS debacle, Red Hat is giving RHEL subscriptions to folks like me who only run one server and don’t make any money off of it. So now I’ve got just under 11TB worth of space on my array, with new drives in a RAID 5 configuration, running Red Hat Enterprise Linux 8.5!!! This is a huge win because I wanted to get certified in RHEL, and now I’ve got a box to work with that has long-term support. And I’ve learned a bunch along the way, enough to convert my Fedora rsync backups to the RHEL OS.

Linux

Certbot Adding Domain to a Cert

I self-host my websites because, well, I’m a computer geek and that’s what we do. Back in the dark ages, we used to pay through the nose for SSL certificates to protect the site content in transit. To this day I won’t deal with GoDaddy because of an issue with that. But then Let’s Encrypt was formed by the industry heavyweights and offered free SSL certs for your self-hosting needs.

As I like registering domains on whims, I need to secure them when I bring them up and here’s what you need to do.

First, look up the certificates that you have with:
certbot certificates

Then rerun certbot with the full list of domains already on the cert and add the new one at the end of the list, in this case, domain4.com:
certbot certonly --cert-name [CERTNAME] -d domain1.com,domain2.com,domain3.com,domain4.com
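
If you do this often enough to fat-finger the list, here’s a small sketch that builds the command for you. It only assembles and prints the command; it never runs certbot. The function name build_expand_command and the certificate name passed to it are made up for illustration; the only certbot options it uses are the two shown above.

```python
import shlex


def build_expand_command(cert_name, existing_domains, new_domain):
    """Build the certbot command that re-issues a cert with one more
    domain. The existing domains come first, the new one goes last."""
    domains = list(existing_domains)
    if new_domain not in domains:
        domains.append(new_domain)
    return [
        "certbot", "certonly",
        "--cert-name", cert_name,
        "-d", ",".join(domains),
    ]


# Hypothetical cert name; use whatever `certbot certificates` reports.
cmd = build_expand_command(
    "domain1.com",
    ["domain1.com", "domain2.com", "domain3.com"],
    "domain4.com",
)
print(shlex.join(cmd))
```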