Online
January 7-8, 2021
11:00 am - 6:00 pm EST
Instructors: Allen Downey, Azalee Bostroem, Rodolfo Montez
Helpers: Meredith Rawls, Iva Momcheva
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who have little to no prior computational experience, and its lessons are domain specific, building on learners' existing knowledge to enable them to quickly apply skills learned to their own research. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.
For more information on what we teach and why, please see our paper "Good Enough Practices for Scientific Computing".
The astronomy-tailored curriculum is designed to provide astronomers with essential skills for data-intensive analysis and visualization. The curriculum focuses on building complex SQL queries using Astroquery, working with the retrieved data in Astropy Tables and Pandas DataFrames, storing the data locally for future use, and communicating the results with clear and compelling figures using Matplotlib.
This workshop is intended for folks in the Astronomy community at all stages of their education and careers. Participants are expected to have knowledge equivalent to the Software Carpentry Python Curriculum: the ability to write a function in Python, familiarity with Python built-in types such as lists and dictionaries, and the ability to navigate directories using the command line. In addition, we welcome participants who are familiar with the concepts presented and would like to provide comprehensive feedback on the lessons.
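As a rough self-check of the expected background, the sketch below uses a function, a dictionary, and a list together. The star names and magnitudes are made-up sample data, and the exercise itself is an illustration, not part of the official curriculum.

```python
# Illustrative only: names and magnitudes are hypothetical sample data.

def brightest(magnitudes):
    """Return the name with the smallest magnitude (smaller means brighter)."""
    return min(magnitudes, key=magnitudes.get)


# A dictionary mapping names to apparent magnitudes.
sample = {"star_a": 4.2, "star_b": 1.3, "star_c": 2.8}

# A list of the names, sorted from brightest to faintest.
ordered = sorted(sample, key=sample.get)

print(brightest(sample))  # star_b
print(ordered)            # ['star_b', 'star_c', 'star_a']
```

If reading and writing code like this feels comfortable, you likely have the Python background the lessons assume.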
Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.
When: January 7-8, 2021.
Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). Participants are expected to have knowledge equivalent to the Software Carpentry Python Curriculum: the ability to write a function in Python, familiarity with Python built-in types such as lists and dictionaries, and the ability to navigate directories using the command line.
Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.
Contact: Please email downey@allendowney.com or abostroem@gmail.com for more information.
Roles: To learn more about the roles at the workshop (who will be doing what), refer to our Workshop FAQ.
Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.
Please be sure to complete these surveys before and after the workshop.
For question 3 "Which of the following workshops are you attending?" please choose "I don't know".
Given the virtual nature of this workshop, there will be three 30-minute breaks rather than two short breaks and a long lunch break. Between consecutive 30-minute breaks, there will also be a five-minute break after approximately 30 minutes of instruction.
Day 1 (January 7)
Before starting | Pre-workshop survey
11am-12:20pm | Welcome
 | Introduction
 | Querying remote databases and downloading results (lesson 1)
12:20pm-12:50pm | Break
12:40pm-2:00pm | Querying remote databases and downloading results cont. (lesson 1)
 | Filtering data by coordinate with Astropy (lesson 2)
2:00pm-2:30pm | Break
2:30pm-3:50pm | Filtering data by coordinate with Astropy cont. (lesson 2)
 | Selecting data by proper motion and visualizing selection (lesson 3)
3:50pm-4:10pm | Break
4:10pm-6pm | Selecting data by proper motion and visualizing selection cont. (lesson 3)
Day 2 (January 8)
11am-12:20pm | Writing reusable functions (lesson 4)
12:20pm-12:50pm | Break
12:40pm-2:00pm | Combining information from more than one table (lesson 5)
2:00pm-2:30pm | Break
2:30pm-3:50pm | Selecting and filtering data with Pandas (lesson 6)
3:50pm-4:10pm | Break
4:10pm-5:50pm | Creating a publication quality figure with Matplotlib (lesson 7)
5:50pm-6:00pm | Post-workshop survey
END
If you haven't used Zoom before, go to the official website to download and install the Zoom client for your computer.
If you have used it before, make sure you have the most recent client installed. In particular, make sure you have a version that allows you to self-select a breakout room.
As in other Carpentries workshops, you will learn by "coding along" with the Instructors. To do this, you will need both a window running a Jupyter notebook and a window for the Zoom video conference client. To see both at once, we recommend using one of the following setup options:
For this workshop, we encourage you to work in a Jupyter notebook. If you are not familiar with Jupyter, you can run a tutorial by clicking here. Then select “Try Classic Notebook”. It will open a notebook with instructions for getting started.
You will need to install Python, Jupyter, and some additional libraries. If you don’t already have Jupyter, we recommend installing Anaconda, which is a Python distribution that contains everything you need to run the workshop code. It is easy to install on Windows, Mac, and Linux, and because it does a user-level install, it will not interfere with other Python installations.
Information about installing Anaconda is here.
If you have the choice of Python 2 or 3, choose Python 3.
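There are two ways to get the libraries you need: install them in an existing Conda environment, or create a new Conda environment for this workshop.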
Installing libraries in an existing environment is simpler, but if you use the same environment for many projects, it will get big, complicated, and prone to package conflicts.
It is also possible to install the libraries you need using pip, but we don’t recommend it.
To create a new Conda environment, you’ll need to download an environment file from our repository. On Mac or Linux, you can download it using wget on the command line:
wget https://raw.githubusercontent.com/AllenDowney/AstronomicalData/main/environment.yml
Or you can download it using this link; make sure the filename is environment.yml, not environment.yml.txt.
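If wget is not available (for example on Windows), the download can also be done from Python. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the URL is the one from the wget command above. The second function handles the case where a browser saved the file as environment.yml.txt.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL copied from the wget command in these instructions.
ENV_URL = (
    "https://raw.githubusercontent.com/AllenDowney/"
    "AstronomicalData/main/environment.yml"
)


def download_environment_file(folder="."):
    """Download environment.yml into `folder` and return its path."""
    target = Path(folder) / "environment.yml"
    urlretrieve(ENV_URL, str(target))
    return target


def fix_txt_suffix(folder="."):
    """Rename environment.yml.txt to environment.yml if needed.

    Returns True if environment.yml exists in `folder` afterwards.
    """
    wrong = Path(folder) / "environment.yml.txt"
    right = Path(folder) / "environment.yml"
    if wrong.exists() and not right.exists():
        wrong.rename(right)
    return right.exists()


# Example usage (requires a network connection for the download):
# download_environment_file()
# fix_txt_suffix()
```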
In a Terminal or Anaconda Prompt, make sure you are in the folder where environment.yml is stored, and run:
conda env create -f environment.yml
Then, to activate the environment you just created, run:
conda activate AstronomicalData
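To confirm that activation worked, you can check the active environment from Python. This sketch assumes that `conda activate` sets the `CONDA_DEFAULT_ENV` environment variable, which is the usual behavior; the function names are hypothetical, and the expected name AstronomicalData comes from the command above.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active Conda environment, or None."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_workshop_env(environ=None, expected="AstronomicalData"):
    """Return True if the active environment has the expected name."""
    return active_conda_env(environ) == expected


print("Active environment:", active_conda_env())
print("Workshop environment active:", is_workshop_env())
```

If the second line prints False, run the activate command again in the same Terminal before launching Jupyter.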
Alternatively, to install the libraries in an existing environment, run the following Conda commands in a Terminal. If you are on a Mac or Linux machine, you should be able to use any Terminal. If you are on Windows, you might have to use the Anaconda Prompt, which you can find under the Start menu.
conda config --append channels conda-forge
conda install jupyter numpy scipy pandas matplotlib seaborn libopenblas
conda install -c conda-forge astropy astroquery gala python-wget
If you are on Windows, you might have to install gala with pip, like this:
pip install gala
Before you launch Jupyter, download this notebook, which contains code to test your environment.
Or you can use wget to download it on the command line, like this:
wget https://raw.githubusercontent.com/AllenDowney/AstronomicalData/main/test_setup.ipynb
To start Jupyter, make sure the right Conda environment is activated, then run:
jupyter notebook
Jupyter should launch your default browser or open a tab in an existing browser window. If not, the Jupyter server should print a URL you can use. For example, when I launch Jupyter, I get:
$ jupyter notebook
[I 10:03:20.115 NotebookApp] Serving notebooks from local directory: /home/username
[I 10:03:20.115 NotebookApp] 0 active kernels
[I 10:03:20.115 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 10:03:20.115 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
In this example, the URL is http://localhost:8888.
When you start your server, you might get a different URL.
Whatever it is, if you paste it into a browser, you should see a home page with a list of directories.
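If the server output is long, the URL can be hard to spot. As a small illustration, the sketch below pulls the first URL out of a block of log text with a regular expression. The function name is hypothetical, and the sample line is copied from the example output above.

```python
import re


def find_server_url(log_text):
    """Return the first http(s) URL found in `log_text`, or None."""
    match = re.search(r"https?://\S+", log_text)
    return match.group(0) if match else None


# Sample line taken from the example server output in these instructions.
sample_log = (
    "[I 10:03:20.115 NotebookApp] The Jupyter Notebook is running at: "
    "http://localhost:8888/"
)

print(find_server_url(sample_log))  # http://localhost:8888/
```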
Now open the notebook you downloaded, test_setup.ipynb, and run the cells that contain import statements.
If they work and you get no error messages, you are all set.
If you get error messages about missing packages, you can install the packages you need using Conda or pip.
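To see all missing packages at once rather than one error at a time, a check like the following may help. It is a sketch, not the contents of test_setup.ipynb: the function name is hypothetical, and the list of import names is an assumption based on the conda install commands earlier (python-wget is assumed to be imported as `wget`).

```python
import importlib.util

# Import names assumed from the install commands in these instructions.
REQUIRED = [
    "numpy", "scipy", "pandas", "matplotlib", "seaborn",
    "astropy", "astroquery", "gala", "wget",
]


def missing_packages(names):
    """Return the names in `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

This only checks that each package can be located; running the import cells in the notebook remains the real test.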
At the end of the notebook, you’ll be asked to copy and paste a line of code from our Slack workspace to the Jupyter notebook and run it. The reason for this test is that some environments convert “straight” quotation marks to “smart” quotation marks, which breaks Python code. If you encounter this problem, you might have to check your system settings to turn off this “feature”.
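The sketch below illustrates why smart quotation marks are a problem and one way to repair a pasted line. The helper names are hypothetical and the pasted line is a made-up example, not the actual line from the Slack workspace.

```python
# Map the four common "smart" quotation marks to their straight equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(code):
    """Replace smart quotation marks in `code` with straight ones."""
    return code.translate(SMART_TO_STRAIGHT)


def is_valid_python(code):
    """Return True if `code` compiles without a SyntaxError."""
    try:
        compile(code, "<pasted>", "exec")
    except SyntaxError:
        return False
    return True


# A made-up example of what a line might look like after a "smart" paste.
pasted = "message = \u201chello\u201d"

print(is_valid_python(pasted))                     # False
print(is_valid_python(straighten_quotes(pasted)))  # True
```

Note that this replacement is blunt: it also changes smart quotes that appear intentionally inside string literals, so turning the conversion off in your system settings is the more reliable fix.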
If you run into problems with these instructions, let us know and we will make corrections. Good luck!