Excel/CSV/Tabular Data Extraction Tools
- Apache Any23: Anything To Triples (any23) is a library, a web service, and a command-line tool that extracts structured data from Web documents. – From: Apache – Tags: RDF, RDFa, Microdata, Microformats, CSV
- CSV to API – Dynamically generate RESTful APIs from static CSVs. Provides JSON, XML, and HTML.
- Libre Information Batch Restructuring Engine – Open data conversion and API tool, created by the Office of the Chief Information Officer of the Commonwealth of Puerto Rico.
- [Datset](http://ramblings.mcpher.com/Home/excelquirks/json): Anything related to JSON and Excel, plus a library of REST API/Excel integrations – Tags: JSON, REST, Excel
- csv2rdf4lod automation (aka “csv2rdf4lod”): Provides a quick and easy way to produce an RDF encoding of data available in CSV format. It also functions as a custom reasoner tailored for heavy-duty data integration. Although csv2rdf4lod can handle tabular data from well-structured RDBMS dumps, its forte is handling “messier” tabular data created manually or with less rigorous information modeling strategies, which makes it well suited to real data that evolved “in the wild”. In either case, csv2rdf4lod is designed to aggregate and integrate multiple versions of multiple datasets from multiple source organizations in an incremental and backward-compatible way. Strong emphasis on provenance. – From: Tim Lebo @TWC RPI – Tags: csv, RDF, linked data, data quality, reconciliation, transformation, enhancement, provenance, linking, workflow
- csv2xml: An XSLT for converting CSV to XML – From: The National Archives – Tags: XML, CSV, TSV
- q: Runs SQL-like statements on tabular text data, including joins and subqueries – Tags: CSV, TSV
- Google Refine (note that this will soon become Open Refine): Lets you clean up, transform, and link data in tabular form – From: Google – Tags: cleaning, transformation, tabular data, linking, reconciliation, desktop tool
- MessyTables: Python library that copes well with opening the many variants of CSV and Excel files. It is used by OpenSpending, among other OKF projects.
- OpenLink Virtuoso Sponger: Existing Cartridges support transformation from CSV and other tabular formats, among many other targets, to RDF. More cartridges are always under development.
- RDF Refine: Google Refine extension for exporting RDF — From:DERI – Tags: RDF, linking, reconciliation, plug-in
- ScraperWiki: Collaborative routine scraping of websites and Excel files to create an API — From:ScraperWiki – Tags: HTML, CSV, Excel, API, scraping
- Tabels: Lets you clean up, transform, and link data, covering not only CSV and similar formats but also PC-Axis, ESRI shapefiles, and others – From: CTIC – Tags: cleaning, transformation, tabular data, linking, reconciliation, online tool
- XLWrap: A spreadsheet-to-RDF wrapper, capable of transforming spreadsheets to arbitrary RDF graphs based on a mapping specification. It supports Microsoft Excel and OpenDocument spreadsheets as well as CSV/TSV files, and it can load local files or download remote files via HTTP. – From: Andreas Langegger – Tags: RDF, Excel, CSV, TSV
- Mr. Data Converter: Converts your Excel data into one of several web-friendly formats, including HTML, JSON and XML. – Tags: HTML, JSON, XML, Excel, MySQL, Ruby
- Tarql: Small command-line tool for converting CSV to RDF, with a user-defined mapping expressed in standard SPARQL. – From: Richard Cyganiak – Tags: RDF, CSV, SPARQL
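Several of the tools listed above (csv2rdf4lod, Tarql, XLWrap) turn tabular rows into RDF according to a mapping. The following is a minimal Python sketch of that general idea, not the method or API of any listed tool. It assumes a hypothetical `http://example.org/` namespace, treats one column as the row identifier, and emits every other non-empty cell as a plain string literal in N-Triples syntax. The function names are illustrative only.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, chosen for illustration


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(text: str, key_column: str) -> list[str]:
    """Emit one triple per non-empty cell; key_column supplies the subject."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}row/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip surplus cells (column is None), the key itself, and blanks.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{BASE}prop/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = "id,name,city\n1,Ada,London\n2,Grace,\n"
    for line in csv_to_ntriples(sample, key_column="id"):
        print(line)
```

Real converters add what this sketch omits: datatype detection, configurable vocabularies, provenance, and handling of messy headers. For anything beyond a quick experiment, one of the listed tools is likely the better choice.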
Analysis / Data Mining Tools
- [R](http://www.r-project.org/): focused on statistical analysis, but many packages are available.
- [Weka](http://www.cs.waikato.ac.nz/ml/weka/)
See list in Data Wrangling Handbook
- ckan: Has some visualisation, particularly around geolocation data.
- TileMill/Mapbox: Interactive map builder.
- Google Visualisation Tools
- OpenLink Data Explorer (ODE)
- Pivot Viewer
- [R](http://www.r-project.org/)
- Odyssey: combining maps and storytelling
- CartoDB: maps from data
- Datawrapper: outstanding open source tool for simple charts on the web.
- Sextant: a web-based and mobile ready platform for visualizing, exploring and interacting with time-evolving linked geospatial data
- RAWGraphs: an open web tool for creating custom vector-based visualizations
- SDMX Converter: A tool that converts statistical datasets between different formats. It is a Java application that is actively developed by Eurostat and published as open source software.
- IMF SDMX Central: Offers a free data conversion service for Excel datasets. Validation conducted prior to conversion ensures that the datasets have a suitable structure to convert to SDMX. In addition to Excel to SDMX, SDMX Central can convert a dataset from CSV to SDMX, and can convert SDMX files to either Excel or CSV.