A Level: Data Science Using Python (A10.1-R5, NIELIT / DOEACC)



    Data science is an interdisciplinary field that uses scientific processes and algorithms to extract knowledge and insights from data, whether structured or unstructured.
    Python has attracted considerable interest as a language for data analysis and data science. It is a free, open-source, general-purpose programming language that is easy to learn, and its versatility makes it well suited to implementing the steps of the data science process.
    Python is used for web development, data analysis, artificial intelligence, and scientific computing. Three of the most widely used Python libraries for data science are NumPy, Pandas, and Matplotlib. NumPy and Pandas are used for analyzing and exploring data. Matplotlib is a data visualization library used for drawing graphs that present the analysis.


    With the growth of the IT industry, demand for skilled data scientists is booming, and Python has become one of the most preferred programming languages for the role. This course focuses on fundamental Python programming techniques, reading and manipulating CSV files, and the main libraries for data science.

    After completing the module, the student will be able to:

    • Take tabular data and clean it
    • Manipulate the data
    • Run basic inferential statistical analyses
    • Perform data analysis
    • Visualize the results of the analysis
    • Build a front-end GUI

    Duration: 120 hours (Theory: 48 hrs + Practical: 72 hrs)

    Detailed Syllabus

    (i) Python Language, Structures, Programming Constructs
    Review of the Python language, Data types, Variables, Assignments, Immutable objects, Strings, String methods, Functions and printing, Lists and their operations, Tuples and Dictionaries, Slicing strings, lists, and tuples.
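
    The short sketch below touches most of the constructs named in this unit. It is only an illustration: the student record, the marks, and the `describe` helper are made up for the example and are not part of the syllabus.

```python
# Strings are immutable: methods return new strings rather than changing the original.
course = "data science"
title = course.title()

# Lists are mutable and support in-place operations.
marks = [72, 85, 91]
marks.append(64)
marks.sort()

# Tuples are immutable sequences; dictionaries map keys to values.
point = (3, 4)
student = {"name": "Asha", "marks": marks}

# Slicing works the same way on strings, lists, and tuples: seq[start:stop:step].
first_word = course[:4]        # first four characters
top_two = marks[-2:]           # last two items of the sorted list
reversed_point = point[::-1]   # a new tuple in reverse order


def describe(record):
    """Return a one-line summary of a student record (hypothetical helper)."""
    average = sum(record["marks"]) / len(record["marks"])
    return f"{record['name']}: average {average:.1f}"


print(title)
print(describe(student))
```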

    (ii) Data Science and Analytics Concepts
    What is Data Science and Analytics? The Data Science Process, Framing the problem, Collecting, Processing, Cleaning and Munging Data, Exploratory Data Analysis, Visualizing results.
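
    One way to picture the process is as a small pipeline of collect, clean, and explore steps. The sketch below uses only the standard library; the inline CSV text, the column names, and the three function names are assumptions chosen for illustration, and real projects would usually read a file and use Pandas.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data; in practice this would come from a CSV file on disk.
RAW_CSV = """city,temperature_c
Delhi,31.5
Mumbai,
Chennai,33.0
delhi ,29.5
"""


def collect(text):
    """Collecting: read CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning: normalise city names and drop rows with a missing temperature."""
    cleaned = []
    for row in rows:
        value = row["temperature_c"].strip()
        if not value:
            continue  # skip rows with no reading
        cleaned.append({"city": row["city"].strip().title(),
                        "temperature_c": float(value)})
    return cleaned


def explore(rows):
    """Exploring: compute the mean temperature per city."""
    by_city = {}
    for row in rows:
        by_city.setdefault(row["city"], []).append(row["temperature_c"])
    return {city: mean(values) for city, values in by_city.items()}


summary = explore(clean(collect(RAW_CSV)))
print(summary)
```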

    (iii) Introduction to NumPy Library
    NumPy: Array Processing Package, Array types, Array slicing, Computation on NumPy Arrays – Universal functions, Aggregations: Min, Max, etc., N-Dimensional arrays, Broadcasting, Fancy indexing, Sorting arrays, Loading data in NumPy from various formats.
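
    The following sketch walks through the array operations listed in this unit on a small made-up score table. The data and variable names are illustrative assumptions; only standard NumPy calls are used.

```python
import io

import numpy as np

# A 2-D array: 3 students (rows) by 4 tests (columns). Values are made up.
scores = np.array([[72, 85, 91, 64],
                   [58, 77, 69, 80],
                   [90, 95, 88, 92]])

# Slicing: first two students, last two tests.
subset = scores[:2, -2:]

# Universal functions operate element-wise without an explicit loop.
roots = np.sqrt(scores)

# Aggregations along an axis: axis=1 gives one value per student (row).
best_per_student = scores.max(axis=1)
mean_per_test = scores.mean(axis=0)

# Broadcasting: the (4,) vector of test means is stretched across all 3 rows.
centered = scores - mean_per_test

# Fancy indexing: select rows by a list of indices.
picked = scores[[2, 0]]

# Sorting: argsort returns the indices that would sort the totals.
ranking = np.argsort(scores.sum(axis=1))[::-1]  # students, highest total first

# Loading data: loadtxt accepts a file name or a file-like object.
loaded = np.loadtxt(io.StringIO("1,2\n3,4"), delimiter=",")
```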

    (iv) Data Analysis Tool: Pandas
    Introduction to the data analysis library Pandas, Pandas objects – Series and DataFrames, Data indexing and selection, NaN (missing) values, Manipulating DataFrames, Grouping, Filtering, Slicing, Sorting, Ufuncs, Combining datasets – merge and join. Querying DataFrame structures for cleaning and processing, lambdas. Aggregation functions and applying user-defined functions for manipulations.
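
    As a rough illustration of these topics, the sketch below builds two small DataFrames and applies missing-value handling, filtering, grouping, a lambda, and a merge. The sales and manager tables are invented for the example, and filling the gap with the column mean is just one possible cleaning choice.

```python
import numpy as np
import pandas as pd

# Hypothetical data: sales records with one missing amount.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South"],
    "amount": [250.0, np.nan, 400.0, 150.0, 300.0],
})
managers = pd.DataFrame({
    "region": ["North", "South", "East"],
    "manager": ["Ravi", "Meena", "Arjun"],
})

# A Series is a single labelled column of a DataFrame.
amounts = sales["amount"]

# Missing data: NaN marks the gap; here it is filled with the column mean.
sales["amount"] = amounts.fillna(amounts.mean())

# Selection and filtering with a boolean mask, then sorting.
large = sales[sales["amount"] >= 275].sort_values("amount", ascending=False)

# Grouping and aggregation: total and mean amount per region.
per_region = sales.groupby("region")["amount"].agg(["sum", "mean"])

# Applying a user-defined function through a lambda.
sales["band"] = sales["amount"].apply(lambda x: "high" if x >= 275 else "low")

# Combining datasets: merge on the shared "region" column.
combined = sales.merge(managers, on="region", how="left")
print(combined)
```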

    (iv) Statistical Concepts and Functions
    The statistics module, manipulating statistical data, calculating results of statistical operations. Probability distributions in Python; functions such as mean, median, mode, and standard deviation. Concepts of correlation and regression.

    (v) Matplotlib
    Visualization with Matplotlib, Simple line plots, Scatter plots, Density and Contour plots – visualizing functions, Multiple subplots, Plotting histograms and bar charts.
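
    A minimal sketch of several of these plot types in one figure follows. The data is synthetic, the output file name `plots.png` is a placeholder, and the non-interactive `Agg` backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(0)  # fixed seed for repeatable random data
samples = rng.normal(size=500)
noisy_cos = np.cos(x) + rng.normal(scale=0.1, size=x.size)

# Multiple subplots: a 2x2 grid of axes in one figure.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, np.sin(x), label="sin(x)")   # line plot
axes[0, 0].legend()
axes[0, 1].scatter(x, noisy_cos, s=8)           # scatter plot
axes[1, 0].hist(samples, bins=20)               # histogram
axes[1, 1].bar(["A", "B", "C"], [5, 3, 7])      # bar chart

for ax, title in zip(axes.flat, ["Line", "Scatter", "Histogram", "Bar"]):
    ax.set_title(title)

fig.tight_layout()
fig.savefig("plots.png")  # placeholder file name
```

    In an interactive session, the `matplotlib.use("Agg")` line can be dropped and `plt.show()` called instead of saving to a file.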

    (vi) GUI – Tkinter
    Tkinter, Python's built-in module for creating GUI applications with Tk. Creating widgets such as button, canvas, label, entry, frame, and check button. Geometry management: pack, grid, and place; organizing layouts and widgets; binding functions; mouse-click events. Building the complete interface of a project.
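
    The sketch below shows how widgets, geometry managers, and event binding might fit together in a tiny temperature converter. The application, the function names `celsius_to_fahrenheit` and `build_app`, and the layout are all assumptions made for illustration.

```python
def celsius_to_fahrenheit(text):
    """Convert a Celsius reading typed by the user; return a message string."""
    try:
        return f"{float(text) * 9 / 5 + 32:.1f} °F"
    except ValueError:
        return "Please enter a number"


def build_app():
    """Build the window and return the root widget without starting the event loop."""
    # Imported here so the converter above can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("Temperature converter")

    frame = tk.Frame(root, padx=10, pady=10)
    frame.pack()  # pack places the frame inside the root window

    # grid arranges the widgets inside the frame by row and column.
    tk.Label(frame, text="Celsius:").grid(row=0, column=0)
    entry = tk.Entry(frame)
    entry.grid(row=0, column=1)
    result = tk.Label(frame, text="")
    result.grid(row=1, column=0, columnspan=2)

    def on_convert(event=None):
        """Handle both the button click and the Return key."""
        result.config(text=celsius_to_fahrenheit(entry.get()))

    button = tk.Button(frame, text="Convert", command=on_convert)
    button.grid(row=2, column=0, columnspan=2)
    entry.bind("<Return>", on_convert)  # bind a key event to the same handler
    return root
```

    Calling `build_app().mainloop()` opens the window and starts the event loop. Keeping the conversion logic in a plain function, separate from the widgets, makes it easy to check without opening a window.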

    (vii) Machine Learning: The Next Step
    What is Machine Learning? Types of Machine Learning algorithms, Training a model on data, and Introduction to various learning algorithms. Applications of Machine Learning.
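
    To make the idea of learning from training data concrete, here is a minimal sketch of one supervised algorithm, k-nearest neighbours, written in plain Python. The syllabus does not name a specific algorithm, so this choice, the tiny labelled dataset, and the function names are assumptions for illustration; in practice a library such as scikit-learn would normally be used.

```python
import math
from collections import Counter

# Made-up labelled data: (height_cm, weight_kg) -> label.
TRAIN = [((150, 50), "small"), ((155, 54), "small"), ((160, 58), "small"),
         ((175, 75), "large"), ((180, 82), "large"), ((185, 88), "large")]
TEST = [((152, 51), "small"), ((182, 85), "large")]


def predict(features, training_data, k=3):
    """Classify one sample by majority vote among its k nearest training points."""
    by_distance = sorted(training_data,
                         key=lambda item: math.dist(features, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def accuracy(test_data, training_data, k=3):
    """Fraction of held-out samples whose predicted label matches the true label."""
    correct = sum(predict(x, training_data, k) == y for x, y in test_data)
    return correct / len(test_data)


print(predict((158, 57), TRAIN))
print(accuracy(TEST, TRAIN))
```

    Evaluating on samples that were kept out of the training set, as `accuracy` does here, is what separates measuring how well a model generalises from measuring how well it memorised.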
