I’m stuck in Tableau land and needed a break. My breaks are partly for figuring out whether I’ve gone crazy or whether I simply cannot use the data I have. Nowadays, a great deal of data is unstructured, which makes it hard to analyze. Data science helps us analyze and understand that data so we can predict future outcomes and make informed decisions based on what we learn.
Many aspiring data scientists want to practice frequently to hone their craft, but they often do not know which dataset sources are best to practice on. If you are looking for the right sources to practice data science, you have come to the right place. Here are a few of the best sources we have gathered for you!
Knoema
This free open data platform is overflowing with data: it offers more than 2.8 billion time series on as many as 1,000 topics. The data covers areas from agriculture to transportation, all gathered from well-known sources such as Facebook, Amazon, Google, and the WHO. Anyone interested in practicing data science, machine learning, or statistics can use Knoema.
Kaggle
Kaggle is arguably the best-known data source platform among data scientists. The best part is that the data is often available with some pre-processing already done. It is a comprehensive source where you can find data on almost anything, including datasets as large as 2 TB. Just type what you are looking for into the search box, then narrow the results with filters for dataset size and file type.
Google Dataset Search
Google Dataset Search has been serving as a handy dataset search engine since 2018. It makes it very easy to find data within seconds. The platform surfaces data about everything from plants and animals to UFOs and more. With the right keywords, format, creator info, and dataset name, you will get relevant results almost instantly. Another exciting bit about Google Dataset Search is that you can search for datasets published in markup languages as well. The platform acts as a sort of unified index across many dataset repositories.
Data.gov
This is a US government-run data portal where you will find more than 248,000 datasets. The platform covers data from many areas of life, gathered from numerous US agencies. Sometimes it is difficult to find the right dataset because it is not always clear which version of the data you are looking at. The best bit, however, is that the data is publicly available and open to all for downloading.
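If you would rather search the catalog from a script than from the website, here is a minimal Python sketch. It assumes the Data.gov catalog exposes a CKAN-style search endpoint at the URL stored in CATALOG_API and that each result carries "title" and "metadata_modified" fields, so verify both against the live site before relying on them. The helper names build_search_url and summarize_results are made up for this example. Listing the last-modified timestamp next to each title is one way to get a feel for which version of a dataset you are looking at.

```python
from urllib.parse import urlencode

# Assumption: Data.gov's catalog offers a CKAN-style search endpoint here.
CATALOG_API = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Build a catalog search URL for the given keywords (hypothetical helper)."""
    return CATALOG_API + "?" + urlencode({"q": query, "rows": rows})


def summarize_results(payload):
    """Return (title, last-modified) pairs from a CKAN-style response dict."""
    results = payload.get("result", {}).get("results", [])
    return [(d.get("title"), d.get("metadata_modified")) for d in results]


# A hand-written sample response, only to show the shape the code expects.
sample = {
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {"title": "Demo dataset", "metadata_modified": "2020-01-01T00:00:00"}
        ],
    },
}

print(build_search_url("air quality"))
print(summarize_results(sample))
```

To run it against the real catalog, you could open the generated URL with urllib.request.urlopen, parse the body with json.load, and pass the resulting dictionary to summarize_results.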
VisualData
This platform deals specifically with image datasets. Here you will find more than 330 datasets on several topics. Animals, plants, robots, and 3D constructions are just a few examples. The images are collected from hobbyists and researchers. You will need the right keywords to find the data you are after.
The sources above are the cream of the crop, but they are certainly not the only options. There are several other data source platforms across the internet. GitHub and Reddit, for example, are also promising data sources for learning purposes.