Table of Contents
This part of the tutorial outlines how reconstruction output from full simulations can be used for analysis by physics working groups.
At the end of this tutorial you should able to:
- determine the location of relevant full simulation reconstruction files,
- understand the file structure of the full simulation reconstruction output,
- use one of several approaches to perform low-level analysis on the reconstruction output.
Location of full simulation reconstruction output
Full simulation outputs are stored on one of several locations:
- S3: https://dtn01.sdcc.bnl.gov:9000/minio/eictest/ATHENA/ (web interface)
- xrootd: root://sci-xrootd.jlab.org//osgpool/eic/ATHENA/ (no preview currently available)
Data at these location is organized in a predicatable structure (though not all directories may be present):
EVGEN/
contains all initial generated events (in hepmc3 or other format),FULL/
contains the raw full simulation output without any reconstruction,RECO/
contains the reconstruction output.
Under the different top-level directories, you can find the different geometry versions (EVGEN/
files are geometry-independent and are missing this level):
master/
always points to a relatively recent geometry in the ATHENA master branch,acadia-v1.0-stable/
points to the first baseline geometry,- etc...
Under the geometry versions, you will find the physics processes:
SINGLE/
contains single particle initial states,DIS/
contains DIS events,EXCLUSIVE/
contains EXCLUSIVE events,- etc...
For details on accessing these locations, please refer to https://doc.athena-eic.org/en/latest/howto/.
For the purpose of this tutorial we will use the S3 interface since it does not rely on a mirroring process at Jefferson Lab (which happens on a 4-hourly basis, at 8am EDT and so on).
File structure of the full simulation reconstruction output
Each reconstruction output file has essentially the same structure, defined by the EIC Data Model. This structure can be retrieved with rootls
, e.g.
root -l s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root
which produces the following output, two trees:
events metadata
You may encounter some (or many) warnings when you run this on a regular ROOT installation, but the ROOT files are built in such a way that they only use basic ("plain old data") types that ROOT can interpret without any helper classes. There are helper classes available inside the ATHENA container.
The events
tree is of course what we are interested in. We can explore its top-level structure as follows. We first start a ROOT session:
root -l s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root
and then list the top-level branches in the events
tree:
events->Print("toponly")
This will display a large number of branches (and their size):
*******************************************************************************Tree :events : Events tree **Entries : 1002 : Total = 44146720 bytes File Size = 14534550 ** : : Tree compression factor = 3.01 *******************************************************************************branch: mcparticles2 117074branch: TrackerBarrelHits2 352204branch: ReconstructedParticles 3763branch: EcalBarrelHitsSimpleDigi 132444branch: EcalBarrelHitsSimpleReco 162677branch: EcalEndcapNHitsDigi 53267branch: EcalEndcapNHitsReco 18557branch: EcalEndcapNProtoClusters 2247branch: EcalEndcapNClusters 5389branch: outputInfoCollection 2744branch: EcalEndcapPHitsDigi 79927branch: EcalEndcapPHitsReco 57765branch: EcalEndcapPHitsRecoXY 53667branch: EcalEndcapPProtoClusters 2439branch: EcalEndcapPClusters 5508branch: EcalEndcapPClustersInfo 2833branch: EcalBarrelHitsDigi 130174branch: EcalBarrelHitsReco 297904branch: EcalBarrelProtoClusters 3416branch: EcalBarrelLayers 8672branch: EcalBarrelClusters 9396branch: EcalBarrelClustersInfo 5286branch: EcalBarrelScFiHitsDigi 5345203branch: EcalBarrelScFiHitsReco 5554064branch: EcalBarrelScFiGridReco 1328000branch: EcalBarrelScFiProtoClusters 67285branch: EcalBarrelScFiClusters 36508branch: EcalBarrelScFiClustersInfo 19039branch: HcalBarrelHitsDigi 54355branch: HcalBarrelHitsReco 44131branch: HcalBarrelHitsRecoXY 36646branch: HcalBarrelProtoClusters 1935branch: HcalBarrelClusters 4251branch: HcalBarrelClustersInfo 2269branch: HcalEndcapNHitsDigi 11722branch: HcalEndcapNHitsReco 13786branch: HcalEndcapNHitsRecoXY 11962branch: HcalEndcapNProtoClusters 1903branch: HcalEndcapNClusters 4115branch: HcalEndcapNClustersInfo 2215branch: HcalEndcapPHitsDigi 7388branch: HcalEndcapPHitsReco 8158branch: HcalEndcapPHitsRecoXY 7660branch: HcalEndcapPProtoClusters 1903branch: HcalEndcapPClusters 4115branch: HcalEndcapPClustersInfo 2215branch: TrackerBarrelRawHits 51559branch: TrackerEndcapRawHits 3664branch: VertexBarrelRawHits 27933branch: VertexEndcapRawHits 2283branch: TrackerBarrelRecHits 148096branch: TrackerEndcapRecHits 7560branch: VertexBarrelRecHits 61902branch: VertexEndcapRecHits 3984branch: ReconstructedParticlesInitFromTruth 3277branch: outputTrackParameters 3519branch: DRICHHits2 8688branch: DRICHHitsDigi 1998branch: DRICHHitsReco 3525branch: EcalEndcapNHits 3717branch: EcalEndcapPHits 3717branch: EcalBarrelHits 3701branch: EcalBarrelScFiHits 3760branch: HcalBarrelHits 3701branch: HcalEndcapPHits 3717branch: HcalEndcapNHits 3717branch: TrackerEndcapHits 4372branch: TrackerBarrelHits 4372branch: VertexBarrelHits 4352branch: VertexEndcapHits 4352branch: DRICHHits 4217
During the development of the reconstruction, there are more branches enabled here than are strictly necessary (e.g. simulated and digitized hits, intermediate reconstruction parameters). These are all available for analysis (with fixed interfaces). In this tutorial we will focus on a few branches in particular:
- ReconstructedParticles: contains the results from track finding and fitting,
- EcalBarrelImagingClusters: contains the results from the barrel Imaging Ecal cluster finding,
- EcalBarrelScFiClusters: contains the results from the barrel ScFi Ecal cluster finding.
We can inspect each of these three branches in more detail (some information removed for formatting)
root [14] events->Print("ReconstructedParticles*")*******************************************************************************Tree :events : Events tree **Entries : 1002 : Total = 44146720 bytes File Size = 14534550 ** : : Tree compression factor = 3.01 ********************************************************************************Br 0 :ReconstructedParticles : Int_t ReconstructedParticles_ **Br 1 :ReconstructedParticles.ID.value : Int_t value[ReconstructedParticles_] **Br 2 :ReconstructedParticles.p.x : Float_t x[ReconstructedParticles_] **Br 3 :ReconstructedParticles.p.y : Float_t y[ReconstructedParticles_] **Br 4 :ReconstructedParticles.p.z : Float_t z[ReconstructedParticles_] **Br 5 :ReconstructedParticles.v.x : Float_t x[ReconstructedParticles_] **Br 6 :ReconstructedParticles.v.y : Float_t y[ReconstructedParticles_] **Br 7 :ReconstructedParticles.v.z : Float_t z[ReconstructedParticles_] **Br 8 :ReconstructedParticles.time : Float_t time[ReconstructedParticles_]**Br 9 :ReconstructedParticles.pid : Int_t pid[ReconstructedParticles_] **Br 10 :ReconstructedParticles.status : Short_t status[ReconstructedParticles_] **Br 11 :ReconstructedParticles.charge : Short_t charge[ReconstructedParticles_] **Br 12 :ReconstructedParticles.momentum : Float_t momentum[ReconstructedParticles_] **Br 13 :ReconstructedParticles.energy : Float_t energy[ReconstructedParticles_] **Br 14 :ReconstructedParticles.mass : Float_t mass[ReconstructedParticles_]**Br 15 :ReconstructedParticles.weight.value : ** | Float_t value[ReconstructedParticles_] **............................................................................*root [6] events->Print("EcalBarrelScFiClusters*")*******************************************************************************Tree :events : Events tree **Entries : 1002 : Total = 44146720 bytes File Size = 14534550 ** : : Tree compression factor = 3.01 ********************************************************************************Br 0 :EcalBarrelScFiClusters : Int_t EcalBarrelScFiClusters_ **Br 1 :EcalBarrelScFiClusters.ID.value : ** | Int_t value[EcalBarrelScFiClusters_] **Br 2 :EcalBarrelScFiClusters.type : Int_t type[EcalBarrelScFiClusters_] **Br 3 :EcalBarrelScFiClusters.energy : ** | Float_t energy[EcalBarrelScFiClusters_] **Br 4 :EcalBarrelScFiClusters.energyError : ** | Float_t energyError[EcalBarrelScFiClusters_] **Br 5 :EcalBarrelScFiClusters.time : Float_t time[EcalBarrelScFiClusters_]**Br 6 :EcalBarrelScFiClusters.nhits : ** | UInt_t nhits[EcalBarrelScFiClusters_] **Br 7 :EcalBarrelScFiClusters.position.x : ** | Float_t x[EcalBarrelScFiClusters_] **Br 8 :EcalBarrelScFiClusters.position.y : ** | Float_t y[EcalBarrelScFiClusters_] **Br 9 :EcalBarrelScFiClusters.position.z : ** | Float_t z[EcalBarrelScFiClusters_] **Br 10 :EcalBarrelScFiClusters.positionError.xx : ** | Float_t xx[EcalBarrelScFiClusters_] **Br 11 :EcalBarrelScFiClusters.positionError.yy : ** | Float_t yy[EcalBarrelScFiClusters_] **Br 12 :EcalBarrelScFiClusters.positionError.zz : ** | Float_t zz[EcalBarrelScFiClusters_] **Br 13 :EcalBarrelScFiClusters.positionError.xy : ** | Float_t xy[EcalBarrelScFiClusters_] **Br 14 :EcalBarrelScFiClusters.positionError.xz : ** | Float_t xz[EcalBarrelScFiClusters_] **Br 15 :EcalBarrelScFiClusters.positionError.yz : ** | Float_t yz[EcalBarrelScFiClusters_] **Br 16 :EcalBarrelScFiClusters.radius : ** | Float_t radius[EcalBarrelScFiClusters_] **Br 17 :EcalBarrelScFiClusters.skewness : ** | Float_t skewness[EcalBarrelScFiClusters_] **Br 18 :EcalBarrelScFiClustersInfo : Int_t EcalBarrelScFiClustersInfo_ **Br 19 :EcalBarrelScFiClustersInfo.clusterID.value : ** | Int_t value[EcalBarrelScFiClustersInfo_] **Br 20 :EcalBarrelScFiClustersInfo.polar.r : ** | Float_t r[EcalBarrelScFiClustersInfo_] **Br 21 :EcalBarrelScFiClustersInfo.polar.theta : ** | Float_t theta[EcalBarrelScFiClustersInfo_] **Br 22 :EcalBarrelScFiClustersInfo.polar.phi : ** | Float_t phi[EcalBarrelScFiClustersInfo_] **Br 23 :EcalBarrelScFiClustersInfo.eta : ** | Float_t eta[EcalBarrelScFiClustersInfo_] **............................................................................*
The ReconstructedParticles branch contains the momentum, e.g. ReconstructedParticles.p.x
for the x-component, and references to other entities related to this particle (clusters, tracks, particles).
Analysis of full simulation reconstruction output with traditional ROOT commands
After opening the reconstruction output file, let's make some pretty plots. We start with:
root -l s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root
and in the ROOT session:
root [1] .lsTS3WebFile** s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root data fileTS3WebFile* s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root data fileKEY: TTree events;1 Events treeKEY: TTree metadata;1 Metadata treeevents->Draw("ReconstructedParticles.p.x")events->Draw("EcalBarrelImagingClustersInfo.polar.phi:EcalBarrelImagingClustersInfo.polar.theta", "EcalBarrelScFiClusters.energy", "colz")
These types of plots will likely be limited to simple data inspection.
Analysis of full simulation reconstruction output with RDataFrame commands
For a more advanced analysis, you can take advantage of the RDataFrame features, such as in this (shortened) DIS example. The original is used in our CI system and determines DIS parameters for every change to the detector geometry, simulation, digitization, or reconstruction.
Edit this page on eicweb
auto momenta_from_reconstruction(const std::vector<eic::ReconstructedParticleData>& parts) {std::vector<ROOT::Math::PxPyPzEVector> momenta{parts.size()};std::transform(parts.begin(), parts.end(), momenta.begin(), [](const auto& part) {return ROOT::Math::PxPyPzEVector{part.p.x, part.p.y, part.p.z, part.energy};});return momenta;}auto convertMtoE(const std::vector<ROOT::Math::PxPyPzMVector>& mom) {std::vector<ROOT::Math::PxPyPzEVector> momenta{mom.size()};std::transform(mom.begin(), mom.end(), momenta.begin(), [](const auto& part) {return ROOT::Math::PxPyPzEVector{part.Px(), part.Py(), part.Pz(), part.energy()};});return momenta;}bool sort_mom_bool(ROOT::Math::PxPyPzEVector &mom1, ROOT::Math::PxPyPzEVector &mom2) {return mom1.energy() > mom2.energy();}auto sort_momenta(const std::vector<ROOT::Math::PxPyPzEVector>& mom) {std::vector <ROOT::Math::PxPyPzEVector> sort_mom = mom;sort(sort_mom.begin(), sort_mom.end(), sort_mom_bool);return sort_mom;}auto Q2(const std::vector<ROOT::Math::PxPyPzEVector>& mom) {std::vector<double> Q2Vec(mom.size() );ROOT::Math::PxPyPzEVector beamMom = {0, 0, 5, 5};std::transform(mom.begin(), mom.end(), Q2Vec.begin(), [beamMom](const auto& part) {return -(part - beamMom).M2();});return Q2Vec;}ROOT::RDataFrame d("events", "s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/DIS/NC/5x41/minQ2=1/pythia8NCDIS_5x41_minQ2=1_beamEffects_xAngle=-0.025_hiDiv_1.0001.root");auto d0 = d.Define("p", momenta_from_reconstruction, {"ReconstructedParticles"}).Define("Q2", Q2, {"p"});auto h_Q2_sim = d0.Histo1D({"h_Q2_sim", "; GeV; counts", 100, -5, 25}, "Q2");auto& h1_Q2_sim = *h_Q2_sim;h1_Q2_sim.DrawClone("hist");