Using the GEOparse package to download NGS data
I used the Python package GEOparse to access and download data from the GEO series GSE277702, as part of my effort to replicate analyses from a manuscript on aging in the heart.
import GEOparse
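# Directory for the raw data; the series file and the supplementary files are both saved here
DATA_PATH = "/home/sychen9584/projects/cardio_paper/data/raw/"
# Fetch the series record from GEO (saved under destdir) and parse it into a GSE object
gse = GEOparse.get_GEO(geo="GSE277702", destdir=DATA_PATH)
# Print the series-level metadata
gse.show_metadata()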
!Series_title = Decoding Aging in the Heart via Single Cell Dual Omics of Non-Cardiomyocytes
!Series_geo_accession = GSE277702
!Series_status = Public on Nov 01 2024
!Series_submission_date = Sep 20 2024
!Series_last_update_date = Jan 11 2025
!Series_web_link = https://www.cell.com/iscience/fulltext/S2589-0042(24)02696-8
!Series_summary = This SuperSeries is composed of the SubSeries listed below.
!Series_overall_design = Refer to individual Series
!Series_type = Expression profiling by high throughput sequencing
!Series_type = Genome binding/occupancy profiling by high throughput sequencing
!Series_contributor = Li,,Wang
!Series_sample_id = GSM8527843
!Series_sample_id = GSM8527844
!Series_sample_id = GSM8527845
!Series_sample_id = GSM8527846
!Series_sample_id = GSM8527847
!Series_sample_id = GSM8527848
!Series_sample_id = GSM8527849
!Series_sample_id = GSM8527850
!Series_sample_id = GSM8527851
!Series_sample_id = GSM8527852
!Series_sample_id = GSM8527853
!Series_sample_id = GSM8527854
!Series_contact_name = Yiran,,Song
!Series_contact_email = yrsong001@gmail.com
!Series_contact_phone = 9843221295
!Series_contact_institute = The University of North Carolina
!Series_contact_address = 2000 Baity Hill Drive, Apt 226
!Series_contact_city = Chapel Hill
!Series_contact_state = NC
!Series_contact_zip/postal_code = 27514
!Series_contact_country = USA
!Series_supplementary_file = ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE277nnn/GSE277702/suppl/GSE277702_RAW.tar
!Series_platform_id = GPL19057
!Series_platform_id = GPL24247
!Series_platform_taxid = 10090
!Series_sample_taxid = 10090
!Series_relation = SuperSeries of: GSE277700
!Series_relation = SuperSeries of: GSE277701
!Series_relation = BioProject: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1163378
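The same fields can be read programmatically instead of by eye. As far as I know, GEOparse exposes them on `gse.metadata` as a dictionary that maps each key (with the `!Series_` prefix stripped) to a list of values, which is why repeated keys such as `sample_id` appear once per line in the output above. Here is a minimal sketch under that assumption. `summarize_series` and `example_metadata` are hypothetical names of mine, and the dictionary is a hand-copied subset of the printed metadata so the snippet runs without network access.

```python
from typing import Dict, List


def summarize_series(metadata: Dict[str, List[str]]) -> Dict[str, object]:
    """Pull a few fields out of a GEOparse-style metadata dict (key -> list of values)."""
    prefix = "SuperSeries of: "
    relations = metadata.get("relation", [])
    return {
        "title": metadata.get("title", [""])[0],
        "n_samples": len(metadata.get("sample_id", [])),
        "platforms": list(metadata.get("platform_id", [])),
        # Keep only the accession part of each "SuperSeries of: GSE..." relation
        "subseries": [r[len(prefix):] for r in relations if r.startswith(prefix)],
    }


# Assumption: this mirrors the shape of gse.metadata; values are copied from the output above.
example_metadata = {
    "title": ["Decoding Aging in the Heart via Single Cell Dual Omics of Non-Cardiomyocytes"],
    "sample_id": [f"GSM{n}" for n in range(8527843, 8527855)],
    "platform_id": ["GPL19057", "GPL24247"],
    "relation": [
        "SuperSeries of: GSE277700",
        "SuperSeries of: GSE277701",
        "BioProject: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1163378",
    ],
}

summary = summarize_series(example_metadata)
print(summary["n_samples"], summary["platforms"], summary["subseries"])
```

With the live object the call would be `summarize_series(gse.metadata)`. Because GSE277702 is a SuperSeries, the SubSeries accessions it returns are where the individual dataset descriptions live.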
# Download the supplementary files, which hold the expression matrices, into DATA_PATH
gse.download_supplementary_files(directory=DATA_PATH)
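To check what the download actually wrote to disk, it helps to list the files folder by folder. The sketch below uses only the standard library. `list_downloaded_files` is a hypothetical helper of mine, and the demo file names are made up, since the exact directory layout GEOparse produces may differ between versions.

```python
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def list_downloaded_files(root: str) -> Dict[str, List[str]]:
    """Map each folder under `root` (as a relative path) to the sorted file names it contains."""
    root_path = Path(root)
    listing = defaultdict(list)
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            folder = str(path.parent.relative_to(root_path))
            listing[folder].append(path.name)
    return dict(listing)


# Offline demo on a throwaway directory; the folder and file names are invented.
with tempfile.TemporaryDirectory() as tmp:
    sample_dir = Path(tmp) / "sample_a"
    sample_dir.mkdir()
    (sample_dir / "matrix.mtx.gz").touch()
    (sample_dir / "barcodes.tsv.gz").touch()
    demo_listing = list_downloaded_files(tmp)

print(demo_listing)
```

On the real data the call would be `list_downloaded_files(DATA_PATH)`. Files sitting directly in the root are grouped under the key `"."`.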