Python

Python – Merge CSVs

I’m really enjoying Python; one of the things I’m really digging is pandas, which lets you work with CSV files to do a multitude of things. When you have to pull data out in a loop, this little piece of code lets you stitch the resulting files together. This sample uses pipe delimiters with UNIX line feeds and quotes all fields.

import os
import glob
import pandas as pd
import csv
#os.chdir("C:\\onedrive_tt\\Testing\\python\\CombineCSVs\\testApplicationFiles")
os.chdir("C:\\app\\python\\CombineCSVs\\testApplicationFiles")

extension = 'csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]

#combine all files in the list
combined_csv = pd.concat([pd.read_csv(f, sep = '|', dtype=str) for f in all_filenames ])
#export to csv
combined_csv.to_csv("combined_csv.csv", sep='|', index=False, encoding='utf-8', lineterminator="\n", quoting=csv.QUOTE_ALL)
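Since the files are being stitched together, it can help to know which file each row came from. Here is a hedged sketch of that variation; the inline strings stand in for the pipe-delimited files on disk, and the `source_file` column name is hypothetical:

```python
import io
import pandas as pd

# Inline data standing in for the pipe-delimited files on disk
sources = {"file_a.csv": "id|val\n1|x\n", "file_b.csv": "id|val\n2|y\n"}
frames = []
for name, text in sources.items():
    df = pd.read_csv(io.StringIO(text), sep='|', dtype=str)
    df['source_file'] = name  # tag each row with its origin
    frames.append(df)
combined = pd.concat(frames, ignore_index=True)
print(combined['source_file'].tolist())  # ['file_a.csv', 'file_b.csv']
```

With the real files you would build `frames` inside the `glob` loop instead of from inline strings.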

SSIS

SSIS – ForEach looping import files

One of the harder things to learn in SSIS is looping through files and importing them into the DB so you can work with them. Almost all integrations produce files named [what-I-am]-[datetime].csv. For example, recruiting-Application_Attachments-20211012 00558 is the manifest file extracting the recruiting application attachments, and the date-time at the back lets us know when.

Hopefully the export can take place using one file, but don’t count on it: you have to apply looping throughout the whole process, from whatever is doing the extract, through the transfer, and onto the load. So if you have a dozen files like the above (or 100, 1,000, 10,000) you need to loop through and transfer them all.

This walk-through from Stack Overflow is dead on and should get you started in the right direction.
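Outside of SSIS, the same file-name convention can be picked apart with a regex, which is handy when scripting the loop yourself. A minimal sketch, with hypothetical file names following the [what-I-am]-[datetime].csv pattern:

```python
import re

# Split a manifest file name into its integration name and its date-time stamp
pattern = re.compile(r"^(?P<name>.+)-(?P<stamp>\d{8}[ _]\d{5})\.csv$")
for f in ["recruiting-Application_Attachments-20211012 00558.csv",
          "recruiting-Application_Attachments-20211013 00561.csv"]:
    m = pattern.match(f)
    print(m.group("name"), "->", m.group("stamp"))
```

In a real loop you would feed `glob.glob()` results through the same pattern instead of a hard-coded list.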

PowerShell

PowerShell – Deleting Empty CSV Files

Sometimes in integration you have jobs that run every 5 minutes or so. These jobs run regardless of whether there’s data, and when there isn’t they produce an ’empty’ file: that’s to say there isn’t any usable data in it, just a header row. One file every 5 minutes is 12 an hour, or 288 in a day. Some are going to have data, others not. While this can be done with a batch file (elsewhere on this site), PowerShell offers a quicker way to do the same thing. Below is a PowerShell script that looks through CSVs and kills any with only one line (the header), leaving anything with real data in place.

$path="C:\onedrive_tt\Client\TCC_Touchpoints-Maintenance\scripts\20201207-CleanupTest\response\"
Get-ChildItem $path -Filter *.csv |
ForEach-Object {
    $content = $path + "$_"
    [int]$LinesInFile = 0
    $reader = New-Object IO.StreamReader $content
    while ($reader.ReadLine() -ne $null) { $LinesInFile++ }
    $reader.Close()
    #Write-Host($content+" - "+$LinesInFile)
    if ($LinesInFile -eq 1)
    {
        #Write-Host("File: "+$content+" with "+$LinesInFile+" line will be deleted")
        Remove-Item -Path $content -Force
    }
}
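If PowerShell isn’t available either, the same header-only check is easy to sketch in Python. This is a hedged equivalent, not a drop-in replacement; the folder path is whatever your response directory happens to be:

```python
import glob
import os

def is_header_only(path):
    # A CSV holding a single line (just the header) counts as empty
    with open(path, encoding='utf-8') as f:
        return sum(1 for _ in f) <= 1

def delete_empty_csvs(folder):
    # Remove header-only CSVs, leave anything with real data in place
    for name in glob.glob(os.path.join(folder, "*.csv")):
        if is_header_only(name):
            os.remove(name)
```

Calling `delete_empty_csvs(r"...\response")` mirrors the PowerShell loop above.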
PowerShell

TCC & PowerShell – CSV to Set Filters

If you’ve been working with TCC for any length of time, you’ll know the purple squirrel, the rainbow unicorn, and the pot at the end of the rainbow is being able to set the filters based on a CSV. That ability would be a boon for automating ad hoc integrations.

Now there are options. Python could pull it off, but Python doesn’t live and play on the stock servers I run across 95% of the time. PowerShell does, however; well, it usually lives there, not sure how much play is involved.

In any case, it turns out we can do this whole process in PowerShell without a whole bunch of headache.

#Declare variables
$sourcePath = "..\..\scripts\mappingFiles\mappingContestNumberToJobCode.csv"
$targetPath = "..\..\scripts\mappingFiles\mappingFilterList.txt"
$targetContent = @()
$outputContent = New-Object System.Text.UTF8Encoding $false
#Loop through source file to build filter
Import-Csv $sourcePath | ForEach-Object {
    $targetContent += "<quer:string>$($_.Requisition)</quer:string>"
}
#Write out file as UTF8 no BOM
[System.IO.File]::WriteAllLines($targetPath, $targetContent, $outputContent)
#Update export specifications with the new filter set
$filter = Get-Content -Path ..\..\scripts\mappingFiles\mappingFilterList.txt -Raw
$exports = 'Requisition', 'RequisitionMapping', 'RequisitionCollaborator', 'RequisitionOpen',
           'SourcingRequest', 'CandidateMatch', 'RequisitionQuestion', 'RequisitionOtherLocation'
foreach ($name in $exports) {
    ((Get-Content -Path "..\..\scripts\${name}Export\${name}ExportTemplate_sq.xml" -Raw) -replace 'FilterList', $filter) |
        Set-Content -Path "..\..\scripts\${name}Export\${name}Export_sq.xml"
}
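For reference, the core of the trick (CSV column in, `quer:string` filter block out, swapped into a template) fits in a few lines of Python too. A hedged sketch with inline stand-ins: the `Requisition` column name comes from the PowerShell example, and the one-line template is a stand-in for the real _sq.xml files:

```python
import csv
import io

# Inline CSV standing in for mappingContestNumberToJobCode.csv
src = "Requisition,JobCode\nREQ-001,JC1\nREQ-002,JC2\n"
filters = "".join(
    f"<quer:string>{row['Requisition']}</quer:string>"
    for row in csv.DictReader(io.StringIO(src))
)
template = "<quer:list>FilterList</quer:list>"  # stand-in for a _sq.xml template
print(template.replace("FilterList", filters))
```

With real files, you would read the CSV and template from disk and write the result out with UTF-8 (no BOM), just as the PowerShell version does.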

Python

Python – Reading CSV as input

One of the more fascinating aspects of Python is its file manipulation capabilities. I have from time to time needed to create 160 or so new batch files to call warehouse integrations. With a proper naming convention in development, this process is handled with 21 lines of Python code. Love it!!!

with open('_RecruitingNewExport.csv') as sourceFile:
    for line in sourceFile:
        line = line.rstrip("\n")
        outputLineTwo = line[:-6]
        # Separate name for the output file so it doesn't shadow the source file handle
        with open("Recruiting_" + line + ".bat", "w") as batFile:
            batFile.write("REM This batch file executes the " + line + " integration\n")
            batFile.write("\n")
            batFile.write("REM Set directories and variables\n")
            batFile.write("@ECHO OFF\n")
            batFile.write("cd /d \"%~dp0\"\n")
            batFile.write("Call Environment.bat\n")
            batFile.write("TITLE %~nx0 – %TALEO_HOST% – %date% %time%\n")
            batFile.write("SET timeStamp=%date:~10,4%-%date:~4,2%-%date:~7,2%_%time:~0,2%-%time:~3,2%-%time:~6,2%\n")
            batFile.write("SET timeStamp=%timestamp: =0%\n")
            batFile.write("\n")
            batFile.write("REM Run the export integration\n")
            batFile.write("Call core\\TCC.bat \"%SCRIPTS_FOLDER%\\CandidateExportScripts\\" + line + "\\" + line + "_cfg.xml\" "
                          "\"%SCRIPTS_FOLDER%\\CandidateExportScripts\\" + line + "\\" + line + "_sq.xml\" "
                          "\"%OUTBOUND_FOLDER%\\" + outputLineTwo + "_%timestamp%.csv\"\n")
            batFile.write("Exit /B %ERRORLEVEL%\n")
            batFile.write("\n")

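A note on the `line[:-6]` slice in the script above: it assumes every integration name ends in a fixed six-character suffix. If that suffix were, hypothetically, `Export`, the idea looks like this:

```python
line = "CandidateNewExport"          # hypothetical integration name
print(line[:-6])                     # CandidateNew - fixed-width slice
print(line.removesuffix("Export"))   # CandidateNew - explicit about the suffix (Python 3.9+)
```

`removesuffix` is the safer choice if any name might not carry the suffix, since the slice would silently chop six characters regardless.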

Taleo Web Services

Taleo SSO Metadata URL

I can’t for the life of me keep track of these, so here they are where I know to look for them.

https://ZONE.taleo.net/smartorg/sp/metadata.jss?target={ENTITY-ID}

https://ZONE.taleo.net/careersection/sp/metadata.jss?target={ENTITY-ID}

Batch

File – Data or No Data

A lot of the work we do for integration is, of course, automated. Sometimes we have to extract a data set and then use that export as the import to another integration. This is fine and well if there is data, but if you have to run it every, let’s say, 5 minutes, there may very well be times when there isn’t anything to pick up. The matter is complicated by the fact that even if there isn’t any information, there will still be a header row.
That file is now completely useless, and if left in place it will run through the process, wind up in the archive, and produce a results file, meaning you have to dig through empty files to try to find what’s going on.
To fix this I came across a batch file that will check the file and delete it if it has only a header row; a further check makes sure that if any data is there it leaves the file alone. For your consideration and my reference, here is the file that does this.

@echo off
setlocal enableExtensions disableDelayedExpansion

for %%a in ("..\..\data\inbound\*.csv") do (
   call :checkFile "%%~a"
   if errorlevel 1 (
      echo %%~a contains no data - deleting
      del "%%~a"
   )
)
endlocal
goto :eof


:checkFile
:: %~1   name of the file to check
:: returns 0 if file contains data, else 1 (== header only).

:: if file contains more than one line: data found
for /F "tokens=2 delims=:" %%a in ('find /c /v "" "%~1"') do (
   for /F "delims= " %%b in ("%%~a") do (
      if not 1 == %%~b exit /b 0
   )
)

:: if only line does not start with "Ident", then line contains data
for /F %%a in ('findstr /v "^Ident" "%~1"') do (
   exit /b 0
)

:: header only
exit /b 1
PowerShell

PowerShell – Replace in Files

One of the hard parts about working on remote servers is that they don’t usually have the proper toolkit to fix things as needed, meaning you have to make the changes on the development workstation, repackage, post, pull, copy, extract, and paste in order to get anything done. Fine if it’s the initial load, but when you only need to change something in a dozen files or so it can get tedious. So, in the spirit of making do with what you have, you can use PowerShell to emulate the replace-in-files normally invoked from Notepad++.
Here is the example:

foreach ($file in Get-ChildItem FileMask*.bat)
{
    (Get-Content $file) -replace 'Text1', 'Text2' | Out-File -Encoding UTF8 $file
}
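When Python does happen to be on the box, the same replace-in-files idea is a short function. A hedged sketch; `FileMask`, `Text1`, and `Text2` are the same placeholders as in the PowerShell example:

```python
import glob

def replace_in_files(mask, old, new, encoding="utf-8"):
    # Read each matching file, swap the text, and write it back in place
    for path in glob.glob(mask):
        with open(path, encoding=encoding) as f:
            text = f.read()
        with open(path, "w", encoding=encoding) as f:
            f.write(text.replace(old, new))

# replace_in_files("FileMask*.bat", "Text1", "Text2")
```

Note this is a plain-text replace, like the PowerShell `-replace` with no regex metacharacters; escape your pattern if you need literal matching there.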
MySQL

MySQL – New Database > New User

I’m trying to get a little more granular on DB security in Linux, and to that end, using the MariaDB root user for all LAMP installations is probably not the brightest idea. I’ll try to commit this to memory, but until then I’ll follow these steps to create a new database and then a new user with full access to that DB, thereby isolating the databases from each other so one breach doesn’t take down the whole system.

mysql -u root -p

CREATE DATABASE newDatabase;

-- MariaDB (and MySQL 5.x) syntax; MySQL 8+ removed IDENTIFIED BY from GRANT, so run CREATE USER first there
GRANT ALL PRIVILEGES ON newDatabase.* TO 'newUser'@'localhost' IDENTIFIED BY 'newPassword' WITH GRANT OPTION;

FLUSH PRIVILEGES;

Python

Python – CSV to Pipe delimiter

So why would a perfectly sane person want a pipe-delimited CSV versus a properly escaped comma-separated value file? Yeah, beats me too; glutton for punishment, not working with the right parser, drug addiction? All could be reasons. But fear not, if this is the road you’d like to go down, there’s a Python script for it, and here it is so I don’t have to reinvent the wheel next time I need it.

import glob
import csv
import os

for entry in glob.glob('applicationResumeAttachmentsManifest/*.csv'):
    # str.strip(".csv") strips characters from both ends, not the extension,
    # so use os.path.splitext to drop the suffix safely
    #outputFile = os.path.splitext(entry)[0] + "-Pipe.csv"
    outputFile = os.path.join("C:\\python\\pipe", os.path.splitext(entry)[0] + "-Pipe.csv")
    os.makedirs(os.path.dirname(outputFile), exist_ok=True)
    #print(entry)
    #print(outputFile)
    with open(entry, encoding='utf-8') as inputFile:
        with open(outputFile, 'w', encoding='utf-8', newline='') as writeFile:
            reader = csv.DictReader(inputFile, delimiter=',')
            writer = csv.DictWriter(writeFile, reader.fieldnames, delimiter='|')
            writer.writeheader()
            writer.writerows(reader)
print("Conversion complete.")
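The `DictReader`/`DictWriter` delimiter swap at the heart of the script can be checked in memory without touching disk. The inline data here is hypothetical:

```python
import csv
import io

# Comma-delimited input in, pipe-delimited output out
src = io.StringIO("id,name\n1,Alice\n2,Bob\n")
out = io.StringIO()
reader = csv.DictReader(src, delimiter=',')
writer = csv.DictWriter(out, reader.fieldnames, delimiter='|', lineterminator='\n')
writer.writeheader()
writer.writerows(reader)
print(out.getvalue())  # id|name
                       # 1|Alice
                       # 2|Bob
```

Because the writer goes through the `csv` module, any field that happens to contain a pipe gets quoted rather than breaking the output.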