paweljakubas/j-data-analysis

Data analysis using J

by Pawel Jakubas, PhD

These notes cover a number of topics that I have found fundamental to mastering data analysis in the J language. To fully follow the examples below, first work through Learning J, the recommended introductory material for the language. A great example of how beautifully and efficiently J can be used in a specific domain is the wonderful Fractals, Visualization and J. A list of high-quality book references follows. The notes are meant to be hands-on and to illustrate how efficiently data analysis can be performed in J. The topics and techniques presented reflect the author's subjective take on what is crucial for mastering the many tasks that powerful data analysis requires.

SQL, data analysis, probability, statistics and machine learning

  1. SQL Cookbook, Anthony Molinaro, Robert de Graaf, 2nd ed., O'Reilly 2021
  2. SQL for Data Analysis: Advanced Techniques for Transforming Data into Insights, Cathy Tanimura, O'Reilly 2021
  3. Data Analysis Techniques for Physical Scientists, Claude A. Pruneau, Cambridge University Press 2017
  4. Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data, Željko Ivezić, Andrew J. Connolly, Jacob T. VanderPlas, and Alexander Gray, Updated Edition, Princeton University Press 2020
  5. Information Theory, Inference, and Learning Algorithms, David J.C. MacKay, Cambridge University Press 2003
  6. Introduction to Probability Models, Sheldon M. Ross, 12th ed., Academic Press 2019
  7. Statistical Inference, George Casella, Roger L. Berger, 2nd ed., Cengage Learning 2001
  8. Computational Statistics, Geof H. Givens and Jennifer A. Hoeting, 2nd ed., Wiley 2013
  9. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies, John D. Kelleher, Brian Mac Namee, Aoife D'Arcy, 2nd ed., MIT Press 2020
  10. Statistical Rethinking: A Bayesian Course with Examples in R and STAN, Richard McElreath, 2nd ed., CRC 2020

J language

  1. Learning J. An Introduction to the J Programming Language, Roger Stokes, [https://www.jsoftware.com/help/learning/contents.htm#toc]
  2. Fractals, Visualization & J, 4th ed. (2 Parts), Clifford A. Reiter 2016
  3. Fifty Shades of J, Norman Thomson, [https://code.jsoftware.com/wiki/Fifty_Shades_of_J]
  4. Linear algebra and random matrices using J, Pawel Jakubas, [https://github.com/paweljakubas/j-random-matrices]
  5. Numerical methods using J, Pawel Jakubas, [https://github.com/paweljakubas/j-numerical-methods] (coming soon)

All code in these notes uses the J903 version of J.

Contents

[Data analysis cases using SQL and J's approach]

[Information-based learning]

[Similarity-based learning]

[Deep look at statistical distributions]

[Classical inference]

[Bayesian inference]

[Error-based learning]

[Deep learning]

[Case study - Titanic]

[Case study - Multimodal Single-Cell Integration]

[Case study - Galaxy classification]

[Case study - Treasury yield curves and surfaces]
