CSV import with PostgreSQL

Importing data is one of the tasks I like the least. It requires to write messy code and it will inevitably be really slow, maybe too slow to be viable.

A good approach in these cases is to rely as much as possible on the database engine using a temporary table to store the CSV content and then calling the various insert into select or update from select queries needed to import data from the temp table. They can definitely save your day.

Last day I stumbled upon this wonderful feature of PostgreSQL that could avoid even the need of parsing the CSV. So, let’s say we have the following database:

image

And the following CSV:


Code;Name;Description
SHO32_LEA01_BLA10;Black shoes;Black leather shoes
SHO32_LEA01_RED10;Red shoes;Red leather shoes
SHO32_PLA90_BLA10;Black shoes;Black plastic shoes
SHO32_PLA90_RED10;Red shoes;Red plastic shoes
HAT76_LEA01_BLA10;Black hat;Black leather hat
HAT76_LEA01_RED10;Red hat;Red leather hat
HAT76_PLA90_BLA10;Black hat;Black plastic hat
HAT76_PLA90_RED10;Red hat;Red plastic hat

we can write the entire import algorithm in the DB. Pseudocode:

  • import the csv inside a temporary table;
  • for each row, detect the correct model code, material code and color code splitting the code we have in the first column;
  • create all missing colors, materials and models;
  • update all existing products (setting the new name and description);
  • insert all new products;

To achieve this we can write a PostgreSQL function. Basically you can see a function like an aggregate of SQL statements. With a function we can store in the DB all the import logic and when we have to effectively import the data, we can simply execute the function.

Okay, let’s see a function that will load our CSV and will import the data (see the comments for info):

/* a function named import_products that requires an argument an returns nothing */
CREATE OR REPLACE FUNCTION import_products(file_path text) RETURNS VOID SECURITY DEFINER AS $BODY$ BEGIN DROP TABLE IF EXISTS tmp_import_data; /*
* create the temporary table
* we start inserting only not null columns
* and then we update the other columns to set references
*/ CREATE TEMP TABLE tmp_import_data( product_id integer, product_code varchar(255) NOT NULL, model_id integer, model_code varchar(5), material_id integer, material_code varchar(5), color_id integer, color_code varchar(5), name varchar(255) NOT NULL, description text NOT NULL ); /* copy the entire csv in tmp_import_data */ execute format ( $$copy tmp_import_data (product_code, name, description) from %L delimiter ';' header csv$$ , file_path); /*
* set model code, material code and color code individually
* splitting the product code
*/ UPDATE tmp_import_data SET model_code = split_part(product_code, '_', 1), material_code = split_part(product_code, '_', 2), color_code = split_part(product_code, '_', 3); /* insert new colors */ INSERT INTO models (code) SELECT DISTINCT model_code FROM tmp_import_data WHERE NOT EXISTS (SELECT id FROM models WHERE code = model_code);
/* set references */ UPDATE tmp_import_data SET model_id = s.id FROM (SELECT id, code FROM models) AS s WHERE s.code = model_code; /* insert new materials */ INSERT INTO materials (code) SELECT DISTINCT material_code FROM tmp_import_data WHERE NOT EXISTS (SELECT code FROM materials WHERE code = material_code);
/* set references */ UPDATE tmp_import_data SET material_id = s.id FROM (SELECT id, code FROM materials) AS s WHERE s.code = material_code; /* insert new colors */ INSERT INTO colors (code) SELECT DISTINCT color_code FROM tmp_import_data WHERE NOT EXISTS (SELECT code FROM colors WHERE code = color_code);
/* set references */ UPDATE tmp_import_data SET color_id = s.id FROM (SELECT id, code FROM colors) AS s WHERE s.code = color_code; /* update name and description for existing products */ UPDATE products SET name = s.name, description = s.description FROM (SELECT model_id, material_id, color_id, name, description FROM tmp_import_data) AS s WHERE products.model_id = s.model_id AND products.material_id = s.material_id AND products.color_id = s.color_id;
/* insert new products */ INSERT INTO products (model_id, material_id, color_id, name, description) SELECT DISTINCT t.model_id, t.material_id, t.color_id, t.name, t.description FROM tmp_import_data AS t WHERE NOT EXISTS ( SELECT id FROM products WHERE model_id = t.model_id AND material_id = t.material_id AND color_id = t.color_id ); END; $BODY$ LANGUAGE plpgsql;

Now we need only to call the function passing a file name, and the function will take care of loading the stuff:

SELECT import_products("/home/user/import.csv");

Easy interstital pages with RoR & meta tags

Last week I was asked to create a simple interstitial page. Basically, a page that a user would be redirected to between page navigation. Normally they are used for ads, but you can be more creative.

I came across a simple way to implement the interstitial page into our link shortener app using HTML meta tags. First, a redirect page was created, ‘redirect.html.erb’. Inside this page, I used the meta attribute ‘http-equiv' to force a refresh after 5 seconds, which along with the 'content' attribute would redirect to a given URL. For example:

<meta http-equiv="refresh" content="5;URL=http://mikamai.com" />
In order to direct this page to the user’s original URL, it was stored in a variable in the controller file, @url, in this case we will define it manually for readability:

@url = "http://mikamai.com"


Now, the URL can be dynamically changed, and we can swap out the static URL in the ‘redirect.html.erb' page for the @url variable:

<meta http-equiv="refresh" content="5;URL=<%= @url %>" />


The process of redirecting to the ‘redirect.html.erb' page is handled in the controller.


def follow
      respond_to do |format|
        format.html { render '/redirect' }  ## Renders the redirect.html.erb page
      end
 end


And thus, once a user goes to a shortened link, for example http://svel.to/1, an interstitial page will be rendered which after 5 seconds will redirect to http://mikamai.com.

I also added a simple countdown using pure JavaScript and the setTimeout function.:

var timer = 5,
countdownSpan = document.getElementById('countdown');

(function timeDown() {
    countdownSpan.innerHTML = timer--;
    if (timer >= 0) {
        setTimeout(function () {
            timeDown();
        }, 1000);
    }
}());

Facebook share buttons and history.pushState()

If you’ve ever tried to use Facebook social plugins' latest version in your website, you might have noticed how smooth and streamlined is now the embedding process.

Select a button type, include the SDK, copy-paste a snippet of code and… you’re done! You can also leave the data-href attribute blank, so the plugin will like/share the current page. Awesome!

At this point you might ask yourself, “Well, will this nice little button work with my super cool AJAX-powered app?”. The answer is not really.

The problem

Facebook social plugins won’t work properly when your application routing is based on Javascript’s history.pushState() API – this includes a large number of Rails applications that rely on Turbolinks for their navigation.

This is because the page metadata is parsed by Facebook SDK only when a full page load is triggered. So, if I navigate from example.com/page-one to example.com/page-two and try to click on my Facebook share button, it will prompt a share for /page-one.

The solution

How to solve this? It’s actually quite easy. A small snippet (here in CoffeeScript, but easily convertible in plain JavaScript) that will update our button and retrigger a parse will be enough:

updateFbButtonHref = (el) ->
  $fbButton = $(el).find('.fb-share-button')
  $fbButton.attr('data-href', window.location.href)

  FB.XFBML.parse()

Now, we can bind this function to the “page:load” event (in the case of a Turbolinks application), or generally after every pushState():


$(document).on 'page:load', ->
  updateFbButtonHref('.my-fb-button-container')

And we’re cool! Our button will now share the correct page.

Some ways to improve Sublime Text (3)

Sublime Text is a fast, extensive, cross-platform text editor which is supported by a large open-source community and a library of plugins & themes. Here are some of the ways I’ve improved my version of Sublime Text:

The most essential Sublime Text plugin is Package Control:



This plugin will help you acquire the rest of the plugs I’ll mention later, and it’s simple to install. If you’re using Sublime Text 3 like me, open the console (View > Show Console) and paste the following:
import urllib.request,os,hashlib; h = '7183a2d3e96f11eeadd761d777e62404' + 'e330c659d4bb41d3bdf022e94cab3cd0'; pf = 'Package Control.sublime-package'; ipp = sublime.installed_packages_path();urllib.request.install_opener( urllib.request.build_opener( urllib.request.ProxyHandler()) ); by = urllib.request.urlopen( 'http://sublime.wbond.net/' + pf.replace(' ', '%20')).read(); dh = hashlib.sha256(by).hexdigest(); print('Error validating download (got %s instead of %s), please try manual install' % (dh, h)) if dh != h else open(os.path.join( ipp, pf), 'wb' ).write(by)

If you’re on Sublime Text 2, use this code:
import urllib2,os,hashlib; h = '7183a2d3e96f11eeadd761d777e62404' + 'e330c659d4bb41d3bdf022e94cab3cd0'; pf = 'Package Control.sublime-package'; ipp = sublime.installed_packages_path(); os.makedirs( ipp ) if not os.path.exists(ipp) else None; urllib2.install_opener( urllib2.build_opener( urllib2.ProxyHandler()) ); by = urllib2.urlopen( 'http://sublime.wbond.net/' + pf.replace(' ', '%20')).read(); dh = hashlib.sha256(by).hexdigest(); open( os.path.join( ipp, pf), 'wb' ).write(by) if dh == h else None; print('Error validating download (got %s instead of %s), please try manual install' % (dh, h) if dh != h else 'Please restart Sublime Text to finish installation') 

The next plugin I recommend is this Git plugin. It’s fairly self-explanitory: it adds Git traits to the text editor. With Package Control, installation is simple: open Package Control with the “⌘+Shift+P” keystroke, type ‘Install Package’, and select ‘Git’.



Third plugin I can recommend is 'Sidebar Enhancements'. This can be installed similarly to the Git plugin. It adds a wealth of features to the bare default sidebar, of which can be seen below:



Fourth: Quick File Creator. This simple plugins adds the feature to create files with the command palette.



Fifth: SublimeLint — error highlighting.



Finally, Emmet - an incredibly useful plugin for any web developer. Emmet is an auto-completion engine for HTML and CSS code. Check out an example in this .gif:

Some useful keyboard shortcuts:


Show file in project: This shortcut needs to be added to Sublime, but it is one that I find myself constantly using. It will reveal the location of the current file in the sidebar. To map it, open your keymap file (⌘+Shift+P, search for ‘key bindings’), and add the following line:

	{ "keys": ["super+shift+r"], "command": "goto_symbol_in_project" }, 


Multiple cursors: Great for repetitive tasks. The keystroke is: 'CTRL+Shift+Arrow Key'. This will allow you to add as many cursors as you want!

Close the current HTML tag: a great keystroke for web-developers: '⌘+Alt+.'

Theming

There are thousands of themes for Sublime, so it’s really down to your preference. I’m using the Soda theme, but you can find many more here!

Look yourself in the mirror with Python and OpenCV

image

The starting point is rather easy:


#!/usr/bin/env python
import numpy as np
import cv2

cap = cv2.VideoCapture(0)
# Set camera resolution. The max resolution is webcam dependent
# so change it to a resolution that is both supported by your camera
# and compatible with your monitor
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 800)

# If you have problems running this code on MacOS X you probably have to reinstall opencv with
# qt backend because cocoa support seems to be broken:
#   brew reinstall opencv --HEAD --qith-qt
cv2.namedWindow('frame', cv2.WND_PROP_FULLSCREEN)
cv2.setWindowProperty('frame', cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)

while True:
    ret, frame = cap.read()
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == 27:
        break

cap.release()
cv2.destroyAllWindows()

With this simple code (adapted from the official doc), we have a fullscreen window showing the camera.

But we want the camera. So, first of all we have to transform this code to something more similar to a real app, written by a real dev.


#!/usr/bin/env python
import numpy as np
import cv2

class Mirror:
  def __init__(self):
    self.cam = cv2.VideoCapture(0)
    self.__setupCamera()
    # If you have problems running this code on MacOS X you probably have to reinstall opencv
    # with qt backend because cocoa support seems to be broken:
    #   brew reinstall opencv --HEAD --qith-qt
    self.__setupWindow()

  # Set camera resolution. The max resolution is webcam dependent
  # so change it to a resolution that is both supported by your camera
  # and compatible with your monitor
  def __setupCamera(self):
    self.cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    self.cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 800)

  def __setupWindow(self):
    cv2.namedWindow('frame', cv2.WND_PROP_FULLSCREEN)
    cv2.setWindowProperty('frame', cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)

  def update(self):
    ret, frame = self.cam.read()
    cv2.imshow('frame', frame)

  def release(self):
    self.cam.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
  mirror = Mirror()
  while True:
    mirror.update()
    if cv2.waitKey(1) == 27:
      break
  mirror.release()

Okay, NOW we can comment the code.

If we run the script directly, it will:

- create a new Mirror

- update the mirror infinitely until the key ESC is pressed

- dispose the Mirror

The Mirror constructor acquires the VideoCapture object referencing the webcam, sets the camera resolution (this is camera dependent) and sets the window to be fullscreen. You should change the resolution used in the code to use one supported by your camera, but be warned that greater resolution means lower fps.

To obtain a real mirror we have to flip the image horizontally adding in the Mirror class a new method responsible of getting the image and applying a filter on it:


  def read(self):
    ret, frame = self.cam.read()
    return cv2.flip(frame, 1)

  def update(self):
    cv2.imshow('frame', self.read())

And here it’s the mirror! We’re just stopping here, but why don’t you try add more stuff ? As you can see here, there are a lot of things OpenCV can do.