News & Insights

Redesigning the Data Supply Chain

In the 1990s, large manufacturers spent millions on software to streamline their supply chains, and they continue to invest in refining processes to make those supply chains as efficient as possible. For those companies whose business is the manufacturing and sale of information, the redesign of “data supply chains” came more slowly, but it is now in high gear.

A typical data supply chain has five parts:

  1. Collection
  2. Cleanup
  3. Appending/enrichment/overlay
  4. Storage
  5. Delivery
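As a rough illustration, the five stages can be pictured as functions chained in order, each handing its output to the next. This is a minimal sketch, not a description of any particular vendor's system: the function names, the sample record, and the in-memory `STORE` list are all hypothetical placeholders.

```python
from typing import Callable

Record = dict
Stage = Callable[[list[Record]], list[Record]]

# In-memory stand-in for a real storage layer (assumption for illustration).
STORE: list[Record] = []


def collect(_: list[Record]) -> list[Record]:
    # Hypothetical source; a real collector would pull from filings, feeds, etc.
    return [{"name": "  acme corp ", "city": "austin"}]


def clean_up(records: list[Record]) -> list[Record]:
    # Trim whitespace and standardize capitalization on every field.
    return [{k: v.strip().title() for k, v in r.items()} for r in records]


def enrich(records: list[Record]) -> list[Record]:
    # Placeholder enrichment: attach one extra, made-up attribute.
    return [{**r, "country": "US"} for r in records]


def store(records: list[Record]) -> list[Record]:
    STORE.extend(records)
    return records


def deliver(records: list[Record]) -> list[Record]:
    # A real delivery stage would publish to an API, app, or file export.
    return records


def run_pipeline(stages: list[Stage]) -> list[Record]:
    data: list[Record] = []
    for stage in stages:
        data = stage(data)  # the hand-off between stages
    return data


result = run_pipeline([collect, clean_up, enrich, store, deliver])
```

Because every stage shares the same signature, a stage can be swapped or a third-party step inserted without touching the others.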

Each of these stages typically involves both human and machine resources and a cluster of custom, off-the-shelf, SaaS, and open source technologies. Tools, almost always custom-built, handle the hand-offs between the stages and shape the design of all the interconnected parts of the process in order to:

  • Push the average age of the data to near real-time
  • Allow flexibility (i.e., easy integration with third parties)
  • Enable rapid, cost-effective scalability.

The current best practices for each stage of the data supply chain are:

  1. Collection:
    • Reduced dependence on telephone verification
    • Self-updating mechanisms
    • Managed crowdsourcing
    • Built-in data validation to keep out “bad” data
    • Outsourced collection teams
    • Real-time data acquisition directly from primary sources (government filings, news articles, social chatter, etc.).
  2. Cleanup: ETL (extract, transform, load) routines and many other tools for normalization and standardization.
  3. Appending/enrichment/overlay: Increased reliance on associating database records with related information rather than “hard-wiring” the related information into the database.
  4. Storage: The cloud.
  5. Delivery: Output channels have multiplied to include iOS, Android, and Windows 8.
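Three of these practices lend themselves to a short sketch: validation at the point of collection, normalization during cleanup, and enrichment by association rather than by copying related values into each record. The example below is illustrative only; the field names, the simple email pattern, and the `COMPANIES` lookup table are assumptions, not part of any specific product.

```python
import re

# Deliberately simple pattern (assumption); real validation is usually stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid(record: dict) -> bool:
    # Built-in validation: reject records with no name or a malformed email.
    name = record.get("name", "").strip()
    email = record.get("email", "").strip()
    return bool(name) and EMAIL_RE.match(email) is not None


def normalize(record: dict) -> dict:
    # Cleanup: collapse whitespace, standardize case.
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "company_id": record["company_id"],
    }


# Related information lives in its own table, keyed by ID (hypothetical data).
COMPANIES = {"c-1": {"company": "Example Co", "industry": "Software"}}


def with_company(record: dict, companies: dict) -> dict:
    # Enrichment by association: resolved at read time, never stored on the record.
    return {**record, **companies.get(record["company_id"], {})}


raw = [
    {"name": "  jane   DOE ", "email": "Jane.Doe@Example.com ", "company_id": "c-1"},
    {"name": "", "email": "not-an-email", "company_id": "c-1"},
]

accepted = [normalize(r) for r in raw if is_valid(r)]
view = [with_company(r, COMPANIES) for r in accepted]

# Updating the related table once changes every associated record's view.
COMPANIES["c-1"]["industry"] = "Data services"
updated_view = [with_company(r, COMPANIES) for r in accepted]
```

The stored records carry only `company_id`, so a change to the company data never requires rewriting them; that is the practical difference from "hard-wiring" the related information.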

As these new supply chains are deployed and refined, the industry gets ever closer to the promised land of fully automated processes to gather and deliver data.
