Buy, sell, and share

Data is the world's most valuable resource.
Now you can trade data on our
revolutionary marketplace.

How Data & Sons Works

Upload any dataset onto our marketplace. Almost every spreadsheet, table, or list is valuable. Take a few minutes to describe your dataset, price it, and post it!

Your dataset is then listed on the Data & Sons market. Buyers can easily find your dataset and buy it in one click.

Once your dataset sells, we transfer your money to you instantly. Best of all, you can sell your dataset multiple times to any interested buyer.


At Data & Sons, we believe all people are entitled to their privacy, and we respect individual rights to protect personal information. We pledge not to allow the sale of personal information on our data marketplace. We safeguard personal privacy by reviewing all data before it is listed for sale on Data & Sons.



By Sean Lux

With data scientists in high demand, lots of training programs have started up to help people learn the skills necessary to enter the profession. I’m guessing many of you may have made a New Year’s resolution to learn or improve your data science skills, so I thought a post identifying the numerous programs would be helpful. I spent a fair amount of time in 2017 researching the various data science training programs available and have categorized these programs by price, payment structure (flat fee, subscription, per class), and curriculum (structured vs. à la carte). Please keep in mind I am not ranking these training programs, just categorizing them. For the complete list with basic details and pricing of each program, check out this dataset.

Data Science Training Programs

Gateway Drugs: Free, à la Carte Courses

These programs allow you to select and take free data science courses. Kaggle’s kernels are very useful for picking up specific data science skills, whereas Data 36 provides more generalized courses. Both provide an easy way to get a basic familiarity with data science. Udacity actually uses its free courses as a gateway to its paid nanodegree programs, and you can take quite a bit of the content for free.

Enter the Dojo: Free Training Programs with Structured Curriculum

Two variations of the free program with structured curriculum are available. The first is general introduction to data science programs. These include Alison, Future Learn, edX, and Cognitive Class. All provide a structured sequence of courses you can take to learn various aspects of statistical analysis programming. Future Learn and edX charge for certifying completion of their programs, but are otherwise free. The second type is programs like Insight and Data Incubator, which target science Ph.D.s transitioning from their fields into data science.
Both offer highly intensive data science boot camps to build on the statistical skills acquired in a Ph.D. program. Both programs then serve as recruiters, placing their students with companies. They make their money on the program from the recruiter fees (remember, if it’s free, you’re the product). Cool win/win idea for recovering academics.

Training Buffets: Pay per Course

Several programs allow you to select data science training courses and pay per course. These programs often have a large catalog of courses and are reasonably priced ($7 - $10 per course). Data Origami, Udemy, and some Udacity content are available on a per course basis. This is a good way to try a few classes to see if data science is right for you or to pick up some specific skills. Dataversity also follows this model, but at a much higher price point ($79 - $129 per course). District Data Labs also has a premium offering focused on corporate training ($25 per participant).

Subscribe Now! Structured Curriculum with Paid Subscription

These data science training programs provide a structured curriculum with multiple learning paths for a low monthly rate. Coursera, Data Camp, Dataquest, Lynda, and O’Reilly all follow this model and offer monthly subscriptions from $25 to $50 a month. These programs incentivize you to hustle: you learn at your own pace, and the faster you move through the content, the less you pay. Data Camp and Dataquest appear to be some of the more popular in the data science community.

Old School: Structured Curriculum with Upfront Tuition

Not surprisingly, this old school model is the most prevalent way to offer data science training. Udacity is perhaps the king of this space, offering the best value with good content in a structured course sequence ($499 - $699). The price goes up from there, from $699 (Simplilearn Data Science) to $8,500 (Thinkful).
Brain Station, General Assembly, K2 Data Science, Springboard, and The Institute for Statistics Education all fall somewhere between these two price points. These programs typically differentiate themselves from the Subscribe Now! programs by offering one-on-one mentors who can help you through the program. If the one-on-one mentoring matters to you, this is the way to go. Otherwise, I did not see a lot of difference between the tuition and subscription based models. In fact, Springboard actually uses Data Camp’s content.

I do not have university courses identified here. If you’ve got plenty of time and money to burn, check out this dataset listing all university data science programs. If I’ve missed a program or there are any inaccuracies, please shoot me an email.
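
The point in the Subscribe Now! section, that the faster you move through the content the less you pay, comes down to simple arithmetic. Here is a minimal sketch using only the price ranges quoted above ($25 to $50 a month for subscriptions, $499 to $699 for Udacity's upfront tuition). The function names are hypothetical and only for illustration; real programs differ in content and mentoring, so this compares price alone.

```python
def subscription_cost(monthly_rate, months):
    """Total paid on a monthly subscription after `months` of study."""
    return monthly_rate * months


def break_even_months(monthly_rate, upfront_tuition):
    """Largest whole number of months for which the subscription total
    is still at or below the upfront tuition."""
    return upfront_tuition // monthly_rate


if __name__ == "__main__":
    # Price points taken from the post; everything else is illustrative.
    for monthly_rate in (25, 50):
        for tuition in (499, 699):
            months = break_even_months(monthly_rate, tuition)
            print(
                f"${monthly_rate}/month vs ${tuition} upfront: "
                f"subscription costs no more through month {months}"
            )
```

Under these assumptions, a $50 a month subscriber who finishes in nine months or less pays no more than the $499 tuition, while a $25 a month subscriber has roughly a year and a half.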

By Sean Lux

December 2017 Newsletter

Happy New Year! Business aside, Greg and I want to thank everyone that has supported us through 2017 and on into 2018. We especially want to thank our very patient wives Bailey and Chaz, as they make us and Data & Sons much, much better in so many ways.

We are big believers that Data & Sons will enable millions to join the knowledge economy by selling data on our marketplace. The amount and diversity of information in a society are two of the biggest predictors of that society’s new knowledge development. By providing a more equitable way to acquire and transfer data, we genuinely believe making all of this new data available will speed innovation and social progress. Thanks to everyone that is making this a reality.

2017 was a major year for Data & Sons and we are proud to announce we finished up strong. In December, our site traffic was up over 1000% from November, and we continue to see strong user growth with more datasets being added to our marketplace. Our initial sales and site traffic have resulted in our first two investors coming on board. Both investors are seasoned tech entrepreneurs and leaders, and we are very excited about the rapid growth we can continue to foster with their capital and expertise!

In November, we anticipated completing a partnership agreement with a data brokerage to provide a greater variety of data. I am happy to announce we now have two such partnerships. The partnerships mean Data & Sons buyers will now have access to over 20 million business and customer contacts! We think this will be a win for all involved in the Data & Sons marketplace. Look for the new data to be added to the marketplace in early January.

We also announced our data request and affiliate partner management platform. We are pleased to report the data request feature is in full development and the affiliate management platform is now in beta.
The data request feature provides buyers the ability to request datasets at a specified price. This effectively creates demand in our marketplace, reducing some of the uncertainty around not knowing what data will sell and for what price. Our affiliate partner management system enables people that refer data sellers to Data & Sons to receive a commission whenever their affiliated seller sells any data. We think this will drive a lot of new content to the site and allow us to rapidly scale.

What’s next in 2018

January will see our first marketing campaign directed at driving buyers to our marketplace. We will focus on Lead Generation datasets since this has been the most active category on Data & Sons. Lead Generation datasets provide buyers the contact information of prospective customers, which they can use for direct (email, phone, mail) and social media (Facebook Audiences) marketing. We think the start of a new year is the perfect time to help businesses find new customers.

We will also continue to bring on investors and will be presenting on January 25th at the Wave Tampa Bay. Our presentation will provide a more detailed understanding of Data & Sons and our revolutionary business model. Please contact me if you would like to attend.

We anticipate continued investor funding will allow us to continue to grow Data & Sons. We will be adding several new team members in 2018. Our new team members will be focused on (1) growing specific data categories by finding data sellers and buyers underserved by how data is developed, acquired, and transferred for these types of data; (2) developing outstanding marketing content that educates buyers and sellers on how to make money on Data & Sons; and (3) development engineering. Adding to our development team enables us to stay adaptive and continue to roll out great new features.

We anticipate adding a bidding function to the marketplace. Buyers can make a bid on any dataset and the Seller can then take or counter the bid.
This will make our market more price efficient. We will also be adding a tutorial section. Buyers and Sellers can learn how they can make money on Data & Sons’ revolutionary marketplace by developing and/or acquiring data for sale.

We hope 2017 was as exciting a year for you as it was for us. Here’s to all of us being empowered, successful, and safe in 2018!

By Sean Lux

What is a Data Scientist?

I’ve been trying to answer this question for well over a year. As an academic turned entrepreneur, I was intrigued by the title data scientist. We were building Data & Sons at the time, and identifying core customers was a key part of the design process. It seemed that people who both developed and utilized data would be natural sellers and buyers on our marketplace. It sounded like data scientists might just be this type of person.

Answering that question took a surprisingly long time. After spending over a year reading, researching, and discussing with people across Fortune 500 companies, startups, data science centric social media, and data science training programs, I think I have a working solution. A prototype data scientist definition, if you will. Given the number of posts on DataTau, Medium, and Reddit asking this same question, I think taking the time to put together a solid working definition is value added for lots of people in the data science field (profession, community, industry?), especially for people interested in joining the profession.

So first, what’s data science? As an organizational scientist, I learned and applied the traditional scientific method: review/observation, theory/hypothesis development, data collection, hypothesis testing using statistical analysis, and hoping to find something publishable. The idea is that the data you collected (your sample) was generalizable to the overall population. So if you found results that supported your hypothesis in a sample of 600 people, you would argue this would be the case in the greater population when you published the study.

Then along came big data. You would no longer need a sample because you could plausibly have the entire population. Instead of 600 people, you now had 4 million if you were Facebook. No need to mess with theory and hypothesis development; you simply ran statistical analysis of the population and the results told you everything you needed to know.
This is what led Wired editor Chris Anderson to observe in 2008 that theory is dead. It is in this context that Jim Gray coined the term data science. Data science meant accumulating enough data that you could skip theory and hypothesis development and rely on the statistical relationships you found in the data. Data science is essentially a science hack.

So does that make data scientists science hackers? I’m going to say no for one primary reason: you need more skills for data science than you do for traditional science. So sure, it’s a hack of the scientific method, but it takes more dedicated learning, experience, and effort to be able to hack that process. Not a very good hack if it requires more effort. It is possessing these skills that I think makes someone a data scientist.

Therefore: Data scientists are professionals competent in statistical analysis, computer programming, and applied problem solving in their domain of interest.

The Venn diagram below illustrates how possessing different combinations of these skills makes people good at different data centric jobs. Because there are so many people running around calling themselves data scientists today, I think the diagram also does a good job of illustrating who is not a data scientist. Let’s review each.

Statistical Competence. I put this at the top of the Venn diagram because understanding statistics is at the core of data science (or really any other data centric role). The whole point is to skip theorizing and rely on statistical relationships. If you cannot find these relationships in your data, you cannot play in data science. This also means you will need to be proficient in R, SPSS, SAS, or Stata, and likely some of the method/model specific software packages.

Applied Problem Solving. I think there are lots of people out there that have statistical competence and/or computer programming skills with “data scientist” in their current job title. I would however argue that they are not data scientists.
Why? Remember the first part of the scientific process is review/observation, which is studying and trying to develop a basic understanding of some subject or phenomenon before you start asking your own research questions. What do people already know about this subject? What don’t we know yet? While big data may take away the need for developing new theory and hypotheses, you still need to know what it is you are studying. If you don’t, you’re going to spend a lot of time and resources to get obvious answers to stupid questions. There’s no faster way to get marginalized in an organization than making more money than most people in the room and presenting them with a detailed research project that tells them exactly what they already knew five years ago.

A data scientist has to know what questions to ask. This requires that you develop a thorough understanding of whatever you are examining with data science (e.g. business, public policy, educational outcomes, etc.). The practitioners (business people, policy wonks, educators, etc.) know their subject area, but they often do not understand the tools data scientists bring to the table and thus have no idea what to tell you to do. In the 2017 Kaggle State of Data Science Survey, the fifth most cited barrier at work (30.2% of respondents) was “Lack of a Clear Question to Answer.” If you don’t know what questions to ask, you cannot have scientist in your job title. Inquiry, whether done through thoughtful theory development or studying massive amounts of data, is at the heart of ALL science. All inquiry starts with asking the right questions.

Computer Programming. A large amount of well organized, accurate, and authentic data is the world’s most valuable resource. This means you are unlikely to just come across it anytime soon, so you’ll need to develop it yourself. You will also need to do this on a repeated basis (i.e. not a one-time data collection).
This may be a few times a year, once a day, or continuously in real time. To collect and analyze data on a repeated basis, you’ll need to build a system that (1) acquires and updates data; (2) organizes that data from different sources into a coherent structure; (3) passes that data into some sort of statistical analysis; (4) presents results in a clear manner (often as a visualization); and (5) does all of this on an automated basis. It’s this last part (the automation) that separates people proficient in statistical analysis who can accomplish tasks 1-4 from data scientists. Most academic researchers (PhD types like me) are highly proficient in tasks 1-4, but are completely clueless when asked to repeat that process on an ongoing basis. Automating that process requires being able to tell a computer to do it, and that requires proficiency in Python, SQL, C++, and/or some other programming language. While strong in statistical analysis and applied problem solving, I would not identify as a data scientist until I had improved upon my current Python and SQL skills... unless of course you had a lot of money to throw at me.

The reality is that the job market for data scientists is very, very hot right now. I realize there are and will continue to be more and more people calling themselves data scientists that do not possess all three of the skills identified. I do think the three skills provide a good educational progression for becoming a data scientist: start with stats, move to programming, and then gain a solid understanding of the area where you are going to apply your craft.
Likely, you will be marketable with a solid statistics background (Data Incubator and Insight Data Science both exist to train you up on the programming side while getting you hired), you will be highly desired as someone with both statistics and programming skills, and once you have several years of experience in a particular industry, you will be extremely sought after and courted as a full-fledged data scientist.
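
To make the five-step system from the Computer Programming section concrete, here is a minimal Python sketch of what such a pipeline could look like. Everything in it is an assumption for illustration: the function names, the hard-coded records standing in for a real data source, and the choice of a per-source mean as the "statistical analysis." It is only meant to show how tasks 1-4 become repeatable once they are chained into a single callable.

```python
import statistics


def acquire():
    """Step 1: acquire and update data.
    Hypothetical stand-in for an API call, scrape, or database query."""
    return [
        {"source": "crm", "customer": "a", "spend": 120.0},
        {"source": "web", "customer": "b", "spend": 80.0},
        {"source": "web", "customer": "c", "spend": 100.0},
    ]


def organize(records):
    """Step 2: organize data from different sources into one structure,
    here a dict mapping each source to its list of spend values."""
    by_source = {}
    for record in records:
        by_source.setdefault(record["source"], []).append(record["spend"])
    return by_source


def analyze(by_source):
    """Step 3: pass the data into a statistical analysis (a simple mean)."""
    return {source: statistics.mean(values) for source, values in by_source.items()}


def present(results):
    """Step 4: present results clearly (plain text instead of a chart)."""
    return "\n".join(
        f"{source}: mean spend {mean:.2f}" for source, mean in sorted(results.items())
    )


def run_pipeline():
    """Step 5: chain steps 1-4 so the whole process runs unattended."""
    return present(analyze(organize(acquire())))


if __name__ == "__main__":
    print(run_pipeline())
```

In practice, a scheduler such as cron could call `run_pipeline` a few times a year, once a day, or continuously, which is the automation step that the post argues separates data scientists from people who can only do tasks 1-4 by hand.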