Additional Resources#
Python and Jupyter Notebooks#
If you want to get an introduction to Python and/or Jupyter Notebooks, I recommend the following resources from Project Pythia:
Project Pythia Foundations also provides tutorials on various core scientific Python packages, such as NumPy, Matplotlib and Pandas, which you will likely encounter at some point.
Xarray, Dask and lazy loading#
The load_product function returns an xarray.Dataset object, which is a powerful data structure for working with multidimensional data. Xarray is a Python library that “[…] introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like arrays, which allows for more intuitive, more concise, and less error-prone user experience.”
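To make the idea of labels a bit more concrete, here is a minimal sketch that builds a small xarray.Dataset by hand. It does not use load_product; the variable name red, the coordinate values and the attribute are made up for illustration only, so a real SDC dataset will look different.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for a loaded product. All names and values are
# hypothetical and only serve to illustrate label-based access.
ds = xr.Dataset(
    data_vars={
        "red": (
            ("time", "latitude", "longitude"),
            np.arange(24, dtype="float32").reshape(4, 2, 3),
        ),
    },
    coords={
        "time": pd.date_range("2021-01-01", periods=4, freq="D"),
        "latitude": [-25.0, -25.1],
        "longitude": [31.0, 31.1, 31.2],
    },
    attrs={"description": "illustrative example"},
)

# Select one pixel by its coordinate labels instead of integer positions.
pixel_series = ds["red"].sel(latitude=-25.1, longitude=31.2)

# Reduce over a named dimension instead of remembering an axis number.
temporal_mean = ds["red"].mean(dim="time")

print(pixel_series.dims)    # only the time dimension remains
print(temporal_mean.shape)  # one value per latitude/longitude pair
```

The same operations on a plain NumPy array would require you to remember which axis is which, which is exactly the kind of error that the labels help to avoid.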
See the following resources for more information:
Xarray Documentation (Very important resource! 😉)
Xarray closely integrates with the Dask library, a “[…] flexible library for parallel computing in Python”, which allows datasets to be loaded lazily: the data is not loaded into memory until it is actually needed. This is especially useful when working with large datasets that might not fit into the available memory. Such datasets are split into smaller chunks that can then be processed efficiently in parallel.
Most of this happens in the background, so you don’t have to worry too much about it. However, it is important to be aware of it, as it affects the way you need to work with the data. For example, you need to be careful with certain Xarray operations, such as accessing .values, as they might trigger the entire dataset to be loaded into memory and can result in performance issues if the data has not been aggregated or indexed beforehand.
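The following sketch illustrates the point about .values with a small, synthetic Dask-backed array. The shape, chunk sizes and the name red are assumptions chosen for the example; the pattern to take away is that aggregating (or indexing) first keeps the part that finally ends up in memory small.

```python
import dask.array as da
import numpy as np
import xarray as xr

# Synthetic, Dask-backed stand-in for a lazily loaded variable.
# Shape, chunk sizes and the name are hypothetical.
lazy = xr.DataArray(
    da.ones((10, 200, 200), chunks=(1, 100, 100), dtype="float32"),
    dims=("time", "latitude", "longitude"),
    name="red",
)

# Nothing has been computed yet; the data is still a Dask array.
print(type(lazy.data))

# Aggregating first keeps the eventual in-memory result small.
# This step is still lazy and only extends the task graph.
mean_per_time = lazy.mean(dim=("latitude", "longitude"))

# Accessing .values now materializes 10 numbers instead of 400,000.
result = mean_per_time.values
print(result.shape)
```

Accessing lazy.values directly would also work for an array this small, but with a real time series of satellite scenes the same call could try to pull the whole dataset into memory at once.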
Furthermore, you might reach a point where you need to use advanced techniques
to optimize your workflow, such as re-orienting the chunks or persisting
intermediate results in memory. For now, just keep all of this in mind and reach
out to me if you have any questions or need help with optimizing your workflow.
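As a rough idea of what such optimizations can look like, here is a minimal sketch using the standard Xarray methods .chunk() and .persist() on a synthetic array. The chunk sizes are arbitrary assumptions and not a recommendation for the SDC data products; suitable values depend on your data and the available memory.

```python
import dask.array as da
import xarray as xr

# Synthetic cube with one chunk per time step (hypothetical layout).
cube = xr.DataArray(
    da.ones((12, 100, 100), chunks=(1, 100, 100)),
    dims=("time", "latitude", "longitude"),
)

# Rechunk for per-pixel time series analysis: a single chunk along
# time (-1 means "the whole dimension") and smaller spatial chunks.
rechunked = cube.chunk({"time": -1, "latitude": 50, "longitude": 50})
print(rechunked.chunks)

# Persist an intermediate result so that later steps reuse the computed
# chunks instead of recomputing them. The result is still Dask-backed.
anomaly = (rechunked - rechunked.mean(dim="time")).persist()

# Subsequent computations start from the persisted chunks.
total = anomaly.sum().compute().item()
print(total)
```

Persisting only helps if the intermediate result fits into memory, so it is usually applied after the data has already been reduced or subset.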
The following resources provide more information:
Digital Earth Africa#
Tutorials#
The two main data products of the SDC, Sentinel-1 RTC and Sentinel-2 L2A, are direct copies of the open and free “Analysis Ready Data” products provided by Digital Earth Africa (DE Africa).
The team of DE Africa provides a lot of very helpful tutorials as Jupyter Notebooks. Some of these tutorials cover more advanced and analysis-specific topics to address real-world problems. While the loading of the data differs between these tutorials and the SDC, most of the analysis techniques can be directly applied to the SDC data products as well. It is therefore highly recommended to have a look at the tutorials in the course of your work with the SDC data products:
deafrica-tools package#
Some of these tutorials use a package called deafrica-tools, which includes useful functions and utilities, e.g. for the calculation of vegetation phenology statistics. You can find the package on GitHub:
If you want to use any functions of deafrica-tools and need assistance with the installation or usage of the package, please let me know!