amolvenkataraman/HawkHacks2024

DataSrc

The rise of LLMs has created a new market for high-quality training data, but data must be annotated before it can be used to train AI models. Today, data annotators either work full-time for very high pay or volunteer their time for no pay. Neither is a good solution: there is no middle ground where people can contribute annotations as a side gig.

Introducing DataSrc

DataSrc is a decentralized platform for data annotation.

Anybody can upload data for annotation, and anybody can annotate. Payments are handled with NEAR and Avalanche.

How it works

When uploading data (currently limited to images) for annotation, the uploader pays for the annotation and the job is added to the MongoDB database.
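The upload step might look roughly like the sketch below. It is not the project's actual code: the field names, the list of accepted extensions, and the `create_job` helper are all hypothetical, and a plain Python list stands in for the MongoDB collection so the example runs without a database.

```python
from dataclasses import dataclass, asdict
from pathlib import PurePath

# Hypothetical set of accepted extensions; the README only says "images".
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


@dataclass
class AnnotationJob:
    """Assumed shape of a job document; the real schema is not documented."""
    uploader: str
    filename: str
    reward: float          # amount the uploader pays for the annotation
    status: str = "open"   # "open" until an annotation is approved


def create_job(store: list, uploader: str, filename: str, reward: float) -> dict:
    """Validate an upload and append the job document to the store."""
    if PurePath(filename).suffix.lower() not in IMAGE_EXTENSIONS:
        raise ValueError("only image uploads are supported")
    if reward <= 0:
        raise ValueError("the uploader must pay for the annotation")
    doc = asdict(AnnotationJob(uploader, filename, reward))
    store.append(doc)  # with MongoDB this would be an insert into a collection
    return doc
```

Calling `create_job([], "alice.testnet", "cat.png", 0.5)` returns a document with `status` set to `"open"`; a non-image filename or a non-positive reward raises `ValueError`.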

When annotating, a random file (image) is loaded and displayed to the annotator. Once a smart contract message approves the annotation, it is stored on the blockchain as an NFT, and the annotator is paid through cryptocurrency.
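The annotation flow above can be sketched in a few lines. This is only an illustration under stated assumptions: `pick_random_job` and `settle_annotation` are hypothetical names, the `approved` flag stands in for the smart contract's approval message, and an in-memory `balances` dictionary stands in for the on-chain payment and NFT minting, which are not modelled here.

```python
import random


def pick_random_job(jobs, rng=random):
    """Return a random open job, or None when nothing is left to annotate."""
    open_jobs = [job for job in jobs if job["status"] == "open"]
    return rng.choice(open_jobs) if open_jobs else None


def settle_annotation(job, annotator, label, approved, balances):
    """Record an approved annotation and credit the annotator.

    `approved` is a stand-in for the smart contract's approval message.
    Returns True if the annotation was accepted and paid out.
    """
    if not approved:
        return False
    job["status"] = "annotated"
    job["annotation"] = {"annotator": annotator, "label": label}
    # Stand-in for the cryptocurrency payout to the annotator.
    balances[annotator] = balances.get(annotator, 0) + job["reward"]
    return True
```

Nothing is recorded or paid until approval arrives, which mirrors the order described above: approval first, then storage and payment.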

Next steps

  • Get the project ready to deploy (connect to mainnet instead of testnet).
  • Work with existing data annotators to understand what tools and features could help them with their job.
  • Reach out to different AI companies about the potential to use DataSrc to outsource data annotations.
  • Improve the API so companies can connect DataSrc directly to their existing tech stack.
  • Improve the annotation logic (collect annotations from multiple people and infer the “correct” annotation).
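One simple way to infer a "correct" annotation from several annotators is a majority vote. The sketch below is a possible starting point rather than the project's planned design; `infer_label` and the `min_agreement` threshold are hypothetical.

```python
from collections import Counter


def infer_label(labels, min_agreement=0.5):
    """Return the most common label if its share of votes exceeds
    `min_agreement`; otherwise return None (no consensus)."""
    if not labels:
        return None
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes / len(labels) > min_agreement else None
```

With the default threshold, three annotations `["cat", "cat", "dog"]` resolve to `"cat"`, while an even split such as `["cat", "dog"]` yields `None`, which could be used to send the image out for more annotations.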
