Julia for High Performance Data Analysis
September 18, 2023, 15:21
Julia is a modern high-level programming language that is fast, with performance on par with traditional HPC languages like Fortran and C, yet relatively easy to write, like Python or MATLAB. It thus addresses the “two-language problem”, which arises when prototype code written in a high-level language has to be combined with, or rewritten in, a lower-level language to improve performance.
Although Julia is a general-purpose language, many of its features are particularly useful for numerical scientific computation, and a wide range of both domain-specific and general libraries are available for statistics, machine learning and numerical modelling.
This online workshop will start by briefly covering the basics of Julia’s syntax and features, and then introduce methods and libraries which are useful for writing high-performance code for modern HPC systems. After attending the workshop, you will:
- Be comfortable with Julia’s syntax, built-in package manager, and development tools.
- Understand core language features like its type system, multiple dispatch, and composability.
- Be able to write your own Julia packages from scratch.
- Know how to perform various linear algebra analyses on datasets.
- Be productive in analysing and visualising large datasets in Julia using dataframes and visualisation packages.
- Be familiar with several Julia libraries for visualisation and machine learning.
- Understand how to analyse large datasets efficiently in Julia using statistical methods.
Prerequisites
- Experience in one or more programming languages.
- Familiarity with basic concepts in linear algebra and machine learning.
- Basic experience with working in a terminal is also beneficial.
- Participants are expected to install Julia, VSCode and Zoom before the workshop starts.
Tentative agenda
Day 1
Introduction to Julia syntax and features.
Day 2
Julia for data analysis: data frames, visualisation, reading and writing various data formats, handling missing data.
Linear algebra: array, matrix and vector operations, performance comparisons, random matrices, sparse matrices, eigenvalues/eigenvectors and PCA.
Day 3
Clustering, classification, machine learning, deep learning.
Day 4
Regression, time series analysis and prediction.
Registration
Registrations are now closed.
Disclaimer
This training is intended for users established in the European Union or in a country associated with Horizon 2020. You can read more about the countries associated with Horizon 2020 here: https://ec.europa.eu/info/research-and-innovation/statistics/framework-programme-facts-and-figures/horizon-2020-country-profiles_e