Additional Resources#

Python and Jupyter Notebooks#

If you want an introduction to Python and/or Jupyter Notebooks, I recommend the following resources from Project Pythia:

Project Pythia Foundations also provides tutorials on various core scientific Python packages, such as NumPy, Matplotlib and Pandas, which you will likely encounter at some point.

Xarray, Dask and lazy loading#

The load_product function returns an xarray.Dataset object, which is a powerful data structure for working with multidimensional data. Xarray is a Python library that “[…] introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like arrays, which allows for more intuitive, more concise, and less error-prone user experience.”

See the following resources for more information:

Overview: Why Xarray?
Tutorial: Xarray in 45 minutes
Xarray Documentation (Very important resource! 😉)
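To make the labelled data model described above more concrete, here is a minimal sketch that builds a small synthetic xarray.Dataset by hand. It does not call load_product; the band names (red, nir), coordinate values and array sizes are made up for illustration and are only assumptions about what a loaded product might look like.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for a loaded product; names and sizes are hypothetical.
rng = np.random.default_rng(seed=0)
ds = xr.Dataset(
    data_vars={
        "red": (("time", "y", "x"), rng.random((4, 3, 5))),
        "nir": (("time", "y", "x"), rng.random((4, 3, 5))),
    },
    coords={
        "time": pd.date_range("2021-01-01", periods=4, freq="D"),
        "y": [10.0, 20.0, 30.0],
        "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    },
    attrs={"description": "synthetic example"},
)

# Select by coordinate label instead of integer position
# (label-based slices include both end points).
subset = ds.sel(time=slice("2021-01-02", "2021-01-03"))

# Reduce over a named dimension instead of an axis number.
temporal_mean = ds["red"].mean(dim="time")

# Arithmetic between variables aligns on the shared coordinates.
ndvi = (ds["nir"] - ds["red"]) / (ds["nir"] + ds["red"])

print(subset.sizes)
print(temporal_mean.dims)
```

The point of the labels is that dimension names and coordinate values carry the meaning, so you never need to remember which axis number is time.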

Xarray closely integrates with the Dask library, which is a “[…] flexible library for parallel computing in Python”. This integration allows datasets to be loaded lazily, meaning that the data is not loaded into memory until it is actually needed. This is especially useful when working with large datasets that might not fit into the available memory: they are split into smaller chunks that can then be processed efficiently in parallel.
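The following sketch illustrates lazy loading with a synthetic, Dask-backed array. The array shape, chunk sizes and the name band are assumptions chosen for illustration; a real SDC product will have its own layout.

```python
import dask.array as da
import xarray as xr

# Synthetic chunked array standing in for a large product (sizes are made up).
lazy = xr.DataArray(
    da.ones((10, 200, 200), chunks=(1, 100, 100)),
    dims=("time", "y", "x"),
    name="band",
)

# Operations on a Dask-backed array only extend a task graph;
# no pixel values are computed at this point.
scaled = lazy * 2.0
is_lazy = isinstance(scaled.data, da.Array)

# The chunk layout can be inspected without computing anything.
chunk_layout = scaled.chunks

# .compute() executes the graph chunk by chunk and returns an in-memory result.
total = float(scaled.sum().compute())

print(is_lazy, chunk_layout[1], total)
```

Until .compute() is called, scaled is only a description of the work to be done, which is why building up a long chain of operations on a large dataset is usually fast.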

Most of this happens in the background, so you don’t have to worry too much about it. However, it is important to be aware of it, as it affects the way you need to work with the data. For example, you need to be careful with certain Xarray operations, such as calling .values: they can trigger the entire dataset to be loaded into memory and cause performance issues if the data has not been aggregated or indexed beforehand. Furthermore, you might reach a point where you need more advanced techniques to optimize your workflow, such as changing the chunk layout (rechunking) or persisting intermediate results in memory. For now, just keep all of this in mind and reach out to me if you have any questions or need help with optimizing your workflow.
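As a rough sketch of the points above, the example below aggregates or indexes before calling .values, changes the chunk layout and persists an intermediate result. The data is synthetic, and the shapes and chunk sizes are assumptions for illustration only; suitable values depend on your data and the analysis.

```python
import dask.array as da
import xarray as xr

# Synthetic Dask-backed cube with one chunk per time step (sizes are made up).
cube = xr.DataArray(
    da.random.random((12, 200, 200), chunks=(1, 200, 200)),
    dims=("time", "y", "x"),
    name="band",
)

# Aggregate first, then convert: only the small result (12 values) is loaded.
mean_series = cube.mean(dim=("y", "x")).values

# Index first, then convert: only a single pixel's time series is loaded.
pixel_series = cube.isel(y=0, x=0).values

# Rechunk for per-pixel time series work: one chunk along time, tiles in space.
rechunked = cube.chunk({"time": -1, "y": 100, "x": 100})

# Persist an intermediate result so later steps reuse it instead of
# recomputing it; the result is still a Dask-backed array.
anomaly = (rechunked - rechunked.mean(dim="time")).persist()

print(mean_series.shape, pixel_series.shape, rechunked.chunks[0])
```

Calling cube.values directly would also work here because the example is tiny, but on a real product it would try to load every pixel of every time step at once.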

The following resources provide more information:

Digital Earth Africa#

Tutorials#

The two main data products of the SDC, Sentinel-1 RTC and Sentinel-2 L2A, are direct copies of the open and free “Analysis Ready Data” products provided by Digital Earth Africa (DE Africa).

The DE Africa team provides many helpful tutorials as Jupyter Notebooks. Some of these tutorials cover more advanced, analysis-specific topics that address real-world problems. While data loading differs between these tutorials and the SDC, most of the analysis techniques can be applied directly to the SDC data products as well. It is therefore highly recommended to have a look at these tutorials while working with the SDC data products:

DE Africa Real World Examples

`deafrica-tools` package#

Some of these tutorials use a package called deafrica-tools, which includes useful functions and utilities, e.g. for calculating vegetation phenology statistics. You can find the package on GitHub:

Digital Earth Africa Tools Package

If you want to use any functions of deafrica-tools and need assistance with the installation or usage of the package, please let me know!