Sollers Blog

The Significance Of Statistics For A Data Scientist!

Posted by Doctor Dan on Feb 16, 2017 11:18:39 AM

Data science is hard to define by consensus, since almost everyone who works with data seems to have a different definition of it. Rather than a definition, a description of what data science involves may help clarify its relation to statistics. Data science can be described as involving data collection and organization, machine learning and modeling using programs, statistical analysis using computational tools, and the development of prototype models that make sense of the data in a way that can be used commercially.

It is widely agreed that data science and statistics are intertwined, and a common view is that data science emerged from the unwillingness of statisticians to adapt to the digital age and computer-generated data. In the pre-computer era, statistics provided a way to make sense of data and to predict events based on it. Trading in the stock market, for example, was supported by statisticians who crunched numbers and sent the results to traders. With the advent of computer programming, programmers can write programs that crunch data at inhuman speeds and predict market fluctuations and events much better than traditional statisticians.

Despite this, statistics remains important to data science because computer models are not perfect. They may not provide good, reusable prototypes that help businesses. Data scientists who have statistical knowledge, or at least a strong grasp of the basics, can be an asset.

Data scientists can thus be described as people who have a strong grounding in statistics, knowledge of a programming language and the ability to adapt it to the modeling task at hand, and the ability to discover patterns in and analyze the data collected.

While the essence of data science is to discover hidden information in a large mass of data, statistics leans more toward the careful use of data. Overall, statistics is important for an aspiring data scientist, though how important depends on the goal. If one is mainly interested in tweaking models to make data processing faster, statistics may not be considered essential or even important. If one wishes to become a machine learning expert and create deep learning models that respond intelligently to human interaction, a strong base in statistics is considered essential. Without statistics, a data analyst cannot tell whether a pattern found during analysis is real, spurious, or predictive.
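The risk of mistaking noise for a pattern can be illustrated with a short simulation. The sketch below is only an illustration, not a method from this post: it generates a random target and many purely random features, then reports the strongest correlation it happens to find. The function names, feature count, and sample size are all assumptions chosen for the example.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_noise_correlation(n_features=200, n_samples=20, seed=0):
    """Largest |correlation| between a random target and random features.

    Every feature is independent noise, so any "pattern" found here is false.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    result = strongest_noise_correlation()
    print(f"strongest correlation found in pure noise: {result:.2f}")
```

With many candidate features and few samples, the strongest correlation is usually far from zero even though nothing real is there, which is the kind of trap statistical training helps an analyst recognize.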

A data scientist wishing to progress along the machine learning and deep learning path must have a working knowledge of statistics. A good way to begin is to learn statistics alongside a heavy focus on coding in either Python or R.

A good intuition for which statistical distribution or model to use, and where to use it, is also an important skill for a data scientist. Beyond this, a strong grounding in traditional statistics is recommended, including Bayesian theory and classical hypothesis testing concepts such as null hypotheses and p-values.
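To make the ideas of a null hypothesis and a p-value concrete, here is a minimal sketch of a permutation test written with only the Python standard library. It is one possible illustration rather than a prescribed approach; the function name, the number of permutations, and the sample data are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both groups come from the same distribution, so the
    group labels are interchangeable.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical page-load times in seconds for two versions of a site.
    old = [12.1, 11.8, 12.4, 12.0, 12.3, 11.9, 12.2, 12.5]
    new = [11.2, 11.5, 11.0, 11.4, 11.3, 11.6, 11.1, 11.7]
    print(f"p-value: {permutation_p_value(old, new):.4f}")
```

A small p-value means a difference this large would rarely arise if the null hypothesis were true. It does not by itself prove that the effect is real or practically important.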

Topics: Data Science