IGO LIMS REST API

Stephanie DelBelso edited this page Aug 30, 2023 · 16 revisions

The endpoints retrieve patient metadata, IGO processing information, sequencer run information and fastq paths.

They were originally developed so that the ACCESS, IMPACT and WES pipelines could automatically pull all the metadata and fastq information needed to start those pipelines.

Swagger lists the available endpoints and all fields returned. The data team can supply the account information needed to access the endpoints.

During the post-sequencing IGO QC process, prior to delivery, entire sequencer runs are archived, so an archived run may contain fastqs that passed QC as well as fastqs that failed. An important point when retrieving fastq information is that the endpoints never return failed fastqs.

IGO LIMS FIELD MAPPING

| Field as Reported | LIMS Database Table.Column | Picklist |
|---|---|---|
| baitSet | SeqAnalysisSampleQC.BaitSet | Bait Selection Choices |
| cfDNA2dBarcode | Sample.MicronicTubeBarcode | |
| cmoInfoIgoId | SampleCMOInfoRecords.SampleId | |
| cmoPatientId | SampleCMOInfoRecords.CmoPatientId | |
| cmoSampleClass | SampleCMOInfoRecords.CMOSampleClass | CMO Sample Class (not used in Sample Sub) |
| naToExtract* | Sample.NAtoExtract | Nucleic Acid Type to Extract |
| recipe* | Sample.Recipe | Recipe |
| sampleType* | sample.ExemplarSampleType | Exemplar Sample Types |
| cmoSampleName | SampleCMOInfoRecords.CorrectedCMOID | |
| collectionYear | SampleCMOInfoRecords.CollectionYear | |
| igoId | Sample.igoId | |
| investigatorSampleId | SampleCMOInfoRecords.UserSampleID | |
| sampleName | SampleCMOInfoRecords.OtherSampleId | |
| sampleOrigin | SampleCMOInfoRecords.SampleOrigin | Sample Origins |
| sex | SampleCMOInfoRecords.sex | Gender |
| species | SampleCMOInfoRecords.Species | Species |
| specimenType | SampleCMOInfoRecords.SpecimenType | Specimen Types |
| tissueLocation | SampleCMOInfoRecords.TissueLocation | |
| tubeId | Sample.TubeBarcode | |
| tumorOrNormal | SampleCMOInfoRecords.TumorOrNormal | Tumor or Normal |
| oncoTreeCode | SampleCMOInfoRecords.TumorType | picklist from oncotree |
| preservation | SampleCMOInfoRecords.Preservation | Preservation |

\* Fields added so SMILE could generate CMO Sample IDs
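
For scripts that need to trace a reported JSON field back to its LIMS source, the mapping can be kept as a small lookup table. The sketch below copies a subset of the rows above; the names `FIELD_TO_LIMS_COLUMN` and `lims_column_for` are illustrative and are not part of the API.

```python
# Subset of the field mapping table: reported JSON field -> LIMS "Table.Column".
# FIELD_TO_LIMS_COLUMN and lims_column_for are hypothetical helper names.
FIELD_TO_LIMS_COLUMN = {
    "baitSet": "SeqAnalysisSampleQC.BaitSet",
    "cfDNA2dBarcode": "Sample.MicronicTubeBarcode",
    "cmoPatientId": "SampleCMOInfoRecords.CmoPatientId",
    "cmoSampleName": "SampleCMOInfoRecords.CorrectedCMOID",
    "investigatorSampleId": "SampleCMOInfoRecords.UserSampleID",
    "sampleName": "SampleCMOInfoRecords.OtherSampleId",
    "oncoTreeCode": "SampleCMOInfoRecords.TumorType",
    "tubeId": "Sample.TubeBarcode",
}


def lims_column_for(field):
    """Return (table, column) for a reported field, or None if it is not in the subset."""
    qualified = FIELD_TO_LIMS_COLUMN.get(field)
    if qualified is None:
        return None
    table, column = qualified.split(".", 1)
    return table, column


print(lims_column_for("oncoTreeCode"))
```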

Downtime

The LIMS resets nightly, so the endpoints are unavailable for a few minutes, currently from 2:30am to 2:35am. The endpoints are also taken down periodically, for up to 20 minutes, to install updates.
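
A client that runs unattended may want to tolerate these outages. The following is a minimal sketch, not an official client: `in_nightly_reset` and `call_with_retry` are hypothetical helpers, the window is assumed to be in the LIMS server's local time, and the default of 5 attempts spaced 300 seconds apart is an assumption chosen to span the 20 minute update window.

```python
import datetime
import time

# Nightly reset window from the text; the time zone is an assumption (server local time).
RESET_START = datetime.time(2, 30)
RESET_END = datetime.time(2, 35)


def in_nightly_reset(clock_time):
    """Return True if a datetime.time falls inside the nightly reset window."""
    return RESET_START <= clock_time < RESET_END


def call_with_retry(fetch, attempts=5, wait_seconds=300, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts are used up.

    OSError is caught because the exceptions raised by the requests
    library derive from IOError, which is an alias of OSError.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == attempts:
                raise  # give up after the final attempt
            sleep(wait_seconds)
```

Here `fetch` would be a zero-argument function that performs the actual request and raises on failure, for example one that calls `requests.get(...)` followed by `raise_for_status()`.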

IGO QC Recommendations

Of particular importance to pipeline analysis is the case where the IGO QC recommendation is to fail a sample but the investigator decides to proceed with sequencing.

In February of 2020, four IGO QC fields were added to /api/getSampleManifest. A failed sample is shown below with the IGO recommendation, comments and then the investigator decision.

"qcReports": [
  {
    "qcReportType": "LIBRARY",
    "IGORecommendation": "Failed",
    "comments": "Low quantity",
    "investigatorDecision": "Stop processing at this time"
  }
]

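A pipeline can use these fields to flag samples that IGO recommended failing but that were not stopped by the investigator. The sketch below is illustrative only: `overridden_failures` is a hypothetical helper, the second report (its type and decision text) is invented for the example, and the full set of decision strings the LIMS can return is not documented here.

```python
# Decision text taken from the qcReports example; other possible values are unknown.
STOP_DECISION = "Stop processing at this time"


def overridden_failures(qc_reports):
    """Return QC reports where IGO recommended failing but the investigator did not stop."""
    return [
        report
        for report in qc_reports
        if report.get("IGORecommendation") == "Failed"
        and report.get("investigatorDecision") not in (None, "", STOP_DECISION)
    ]


qc_reports = [
    {"qcReportType": "LIBRARY", "IGORecommendation": "Failed",
     "comments": "Low quantity", "investigatorDecision": "Stop processing at this time"},
    # Hypothetical report: the report type and decision text are invented for illustration.
    {"qcReportType": "DNA", "IGORecommendation": "Failed",
     "comments": "Low quantity", "investigatorDecision": "Continue processing"},
]

for report in overridden_failures(qc_reports):
    print("Proceeded despite failed QC:", report["qcReportType"], report["comments"])
```
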
Is a request a CMO Request?

That logic is documented and implemented here: https://github.com/mskcc/LimsRest/blob/master/src/main/java/org/mskcc/limsrest/service/cmorequests/CheckOrMarkCmoRequestsTask.java

Re-delivery of the same project

On rare occasions IGO will deliver the same project multiple times, adding runs and fastqs. In these cases the /api/getDeliveries endpoint lists each request only once, with the most recent delivery date; prior delivery dates are not retrievable from the endpoint.
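
Because earlier delivery dates cannot be recovered later, a consumer that cares about re-deliveries has to record each date as it sees it. The following is a rough sketch under stated assumptions: `record_delivery` is a hypothetical helper, the request ID and dates are made up, and the shape of the /api/getDeliveries response is not shown here, so the function takes already-extracted values.

```python
def record_delivery(history, request_id, delivery_date):
    """Remember delivery_date for request_id; return True if this date is new.

    history maps a request ID to the set of delivery dates seen so far.
    """
    seen = history.setdefault(request_id, set())
    if delivery_date in seen:
        return False
    seen.add(delivery_date)
    return True


history = {}
# Hypothetical request ID and dates, polled on two different days.
record_delivery(history, "01234_B", "2023-08-01")
if record_delivery(history, "01234_B", "2023-08-15"):
    print("Request 01234_B was re-delivered; dates seen:", sorted(history["01234_B"]))
```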

MSK-ACCESS Specific Field

To help track samples for MSK-ACCESS, the LIMS field "MicronicTubeBarcode" is returned in the JSON response as "cfDNA2dBarcode".

RNASEQ Projects

Since 2019, IGO has run three types of RNASeq projects: RNASeq-TruSeqRiboDeplete, RNASeq-TruSeqPolyA and RNASeq-SMARTerAmp. Only for these three project types do the endpoints report the strand (non-stranded for SMARTerAmp, stranded-reverse for both TruSeq types) and LibraryType fields. For historical projects prior to 2019, such as RNASeq-TruSeqFusion, the strand results in the endpoint are not reliable.
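
A pipeline can encode this rule so that strand values are only trusted for the three supported project types. This is a minimal sketch: `STRAND_BY_RECIPE` and `expected_strand` are hypothetical names, and it assumes the project type arrives as exactly one of the strings listed above.

```python
# Strand expected for each RNASeq project type that reports it reliably.
STRAND_BY_RECIPE = {
    "RNASeq-TruSeqRiboDeplete": "stranded-reverse",
    "RNASeq-TruSeqPolyA": "stranded-reverse",
    "RNASeq-SMARTerAmp": "non-stranded",
}


def expected_strand(recipe):
    """Return the expected strand, or None when the strand is not reliably reported."""
    return STRAND_BY_RECIPE.get(recipe)


for recipe in ("RNASeq-SMARTerAmp", "RNASeq-TruSeqFusion"):
    strand = expected_strand(recipe)
    if strand is None:
        print("{}: strand not reliable, determine it another way".format(recipe))
    else:
        print("{}: {}".format(recipe, strand))
```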

Example Python Script to Connect to the Endpoint

# This script connects to the LIMS endpoint and prints fastq.gz paths for a given IGO sample ID.
import sys
import requests
from requests.auth import HTTPBasicAuth

LIMS_ENDPOINT = "https://igolims.mskcc.org:8443/LimsRest/api/getSampleManifest?igoSampleId="
LIMS_USER = 'user'
LIMS_PASS = '*******'

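def main():
    # Require exactly one argument: the IGO sample ID to look up
    if len(sys.argv) != 2:
        sys.exit("Usage: {} <igoSampleId>".format(sys.argv[0]))
    igo_sample_id = sys.argv[1]

    lims_url = LIMS_ENDPOINT + igo_sample_id
    print("Sending LIMS GET request - {}".format(lims_url))

    # send query for one sample at a time
    # NOTE: verify=False skips TLS certificate verification
    response = requests.get(lims_url, verify=False, auth=HTTPBasicAuth(LIMS_USER, LIMS_PASS))
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
    manifest = response.json()
    print("Received response: {}".format(manifest))
    # Traverse the response for each library and run
    for library in manifest[0]['libraries']:
        for run in library['runs']:
            print("On Flowcell {} ".format(run['flowCellId']))
            print("There were the following fastqs {}".format(run['fastqs']))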

if __name__ == '__main__':
    main()