
Working with CSV files using NodeJS


I worked with CSV data some years back, and I've always been curious to see how well NodeJS handles CSV files on the backend compared with the likes of Java, .NET, Ruby or PHP environments.

As it turns out, you can do quite a lot with very little, starting by importing the following modules.

let fs = require('fs');             // Node's built-in filesystem module
let fastcsv = require('fast-csv');  // npm module for fast CSV parsing

We use the fast-csv module to handle the CSV data (a good fit when the dataset is fairly large, since the module is built for performance), reading it from a file stream like so.

let readableStreamInput = fs.createReadStream('./some-csv-table.csv');
let csvData = [];

fastcsv
    .fromStream(readableStreamInput, {headers: true})
    .on('data', (data) => {
      // Copy each field of the parsed row into a fresh object,
      // keyed by its column header.
      let rowData = {};

      Object.keys(data).forEach(current_key => {
         rowData[current_key] = data[current_key];
      });

      csvData.push(rowData);

    })
    .on('end', () => {
      console.log('csvData', csvData);
      console.log('total rows of table', csvData.length);
    });
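
A quick version note before unpacking this: fromStream comes from the older fast-csv API; in newer releases (v3+) the same read is done with parseStream. Here's a minimal sketch of the equivalent against the newer API, assuming a current fast-csv install:

let fs = require('fs');
let fastcsv = require('fast-csv');

let csvData = [];

fastcsv
    .parseStream(fs.createReadStream('./some-csv-table.csv'), {headers: true})
    .on('error', error => console.error(error))
    .on('data', row => csvData.push(row))  // each row arrives as a plain object keyed by header
    .on('end', rowCount => console.log(`parsed ${rowCount} rows`));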

Going back to the original snippet, here's what we're saying:

  • First, we create and open a ReadableStream object, readableStreamInput, on our CSV file so we can stream its contents from the filesystem.
  • We then hand readableStreamInput to fast-csv's fromStream method. In the same call, I pass {headers: true} because I want to reference each row's field values by their corresponding field names when parsing.
  • Then, within our data callback handler, fast-csv surfaces each line of the CSV input stream so we can decide what to do with it. Normally, we can perform a number of interesting operations here, especially when refining and parsing CSV input data. In this case, I create a new rowData object for each line that comes through, capturing each row's field data as individual object properties keyed by the respective field name, current_key.
  • Once I'm done parsing all the column fields in the row, I push that row instance onto my csvData array.
  • And then the process repeats for the rest of the 'rows' in the CSV table file (see the sample run sketched below).
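
To make the flow concrete, here's a hypothetical run (the file contents and column names below are made up purely for illustration). Say some-csv-table.csv contains:

name,age,city
Alice,34,Sydney
Bob,28,Melbourne

With {headers: true}, the end handler would then log something along the lines of:

csvData [ { name: 'Alice', age: '34', city: 'Sydney' },
  { name: 'Bob', age: '28', city: 'Melbourne' } ]
total rows of table 2

Note that fast-csv hands every value over as a string; any numeric coercion is up to you inside the data handler.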

What's cool about this is that you can immediately serve this parsed CSV data to the front end in JSON format, without configuring any other web server middleware on the backend, just by adding a few lines of code.

let http = require('http');

let server = http.createServer((req, resp) => {
  // Respond to every request with the parsed CSV rows as JSON.
  resp.writeHead(200, {'content-type': 'application/json'});
  resp.end(JSON.stringify(csvData));
});

server.listen(5050);
console.log('Server listening on port: 5050');
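
One caveat worth flagging: the stream parses asynchronously, so a request that arrives before the 'end' event fires would see an empty csvData array. A minimal way around that, sketched here against the same API used above, is to start the server from inside the 'end' handler:

let fs = require('fs');
let fastcsv = require('fast-csv');
let http = require('http');

let csvData = [];

fastcsv
    .fromStream(fs.createReadStream('./some-csv-table.csv'), {headers: true})
    .on('data', row => csvData.push(row))
    .on('end', () => {
      // Only start accepting requests once every row has been parsed.
      http.createServer((req, resp) => {
        resp.writeHead(200, {'content-type': 'application/json'});
        resp.end(JSON.stringify(csvData));
      }).listen(5050, () => console.log('Server listening on port: 5050'));
    });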

And that’s it!

With this, it leaves room for other possibilities in your web stack application, such as persisting your JSON data representation through some ORM/ODM model into non-relational databases (e.g. MongoDB, Cassandra) as well as relational ones (e.g. MySQL, Postgres), and then designing your RESTful endpoints layer on top of it. You end up with a perhaps unsophisticated but lightweight architecture you can start with, and one that lets you scale your web app without much friction.
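
As a rough illustration of that endpoints layer, here's a sketch assuming Express is installed (npm install express); the /rows route and the name-column lookup are invented for this example, and csvData stands in for the array populated by the parse shown earlier:

let express = require('express');
let app = express();

let csvData = [];  // populated by the fast-csv parse shown earlier

// Read-only endpoint exposing all parsed CSV rows.
app.get('/rows', (req, res) => {
  res.json(csvData);
});

// Lookup by a hypothetical 'name' column, assuming the CSV has one.
app.get('/rows/:name', (req, res) => {
  let match = csvData.find(row => row.name === req.params.name);
  if (match) {
    res.json(match);
  } else {
    res.status(404).json({error: 'not found'});
  }
});

app.listen(5050, () => console.log('REST layer listening on port: 5050'));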

Libraries in the NodeJS ecosystem certainly take care of all the nitty-gritty of streaming, fetching and serializing data online for users to consume. Their rich packages offer plenty of useful and, often, simpler abstractions that help you write reliable code.

Happy Coding!
