D3.1 Status report on the review of new data sources and methods

Summary
This deliverable is part of WP3 within the STORM project. It provides a literature review of the state of the art and the current knowledge gaps for Big Data analytics, methods and algorithms in the freight sector. It also reviews relevant applications and concludes with a summary of the challenges and opportunities of Big Data in the freight sector.
The main attributes of Big Data are captured by the “5V” concept: volume, velocity, variety, veracity, and value. Big Data, however, remains a large bulk of unstructured data that is of no real value unless it is converted into useful information. This is where Big Data analytics plays a major role: it applies advanced analytic techniques, including data mining, statistical analysis, and predictive analytics, to large datasets as a new business intelligence practice. It enables the analysis of huge amounts of complex data while still harnessing traditional data and tools. It also holds promise for exploring the hidden structure of each subpopulation of the data, which is traditionally not feasible because such subpopulations might even be treated as ‘outliers’ when the sample size is small.
A variety of challenges hinder access to and utilization of Big Data for freight transport applications because of its nature and unique characteristics. These include, but are not limited to, data collection; data ownership and accessibility; heterogeneity and standardization; storage; privacy and legal constraints; technical challenges and expertise; and the quality, validation and representativeness of the data. A lack of awareness of, or interest in, data and data-driven decision-making among senior managers can be a major organizational challenge. Privacy rules often forbid the use of data for purposes other than those explicitly mentioned in the agreement or contract with the users. Likewise, sharing data collected by companies with third parties requires individual non-disclosure agreements to be negotiated, which forms a major obstacle to data collaboration and exchange.
Finally, we identify a few areas for further research, including data collection and preparation, data analytics and utilization, and applications to support decision-making.