EIC software

Reconstruction Analysis

This part of the tutorial outlines how reconstruction output from full simulations can be used for analysis by physics working groups.

At the end of this tutorial you should able to:

  • determine the location of relevant full simulation reconstruction files,
  • understand the file structure of the full simulation reconstruction output,
  • use one of several approaches to perform low-level analysis on the reconstruction output.

Location of full simulation reconstruction output

Full simulation outputs are stored on one of several locations:

Data at these location is organized in a predicatable structure (though not all directories may be present):

  • EVGEN/ contains all initial generated events (in hepmc3 or other format),
  • FULL/ contains the raw full simulation output without any reconstruction,
  • RECO/ contains the reconstruction output.

Under the different top-level directories, you can find the different geometry versions (EVGEN/ files are geometry-independent and are missing this level):

  • master/ always points to a relatively recent geometry in the ATHENA master branch,
  • acadia-v1.0-stable/ points to the first baseline geometry,
  • etc...

Under the geometry versions, you will find the physics processes:

  • SINGLE/ contains single particle initial states,
  • DIS/ contains DIS events,
  • EXCLUSIVE/ contains EXCLUSIVE events,
  • etc...

For details on accessing these locations, please refer to https://doc.athena-eic.org/en/latest/howto/.

For the purpose of this tutorial we will use the S3 interface since it does not rely on a mirroring process at Jefferson Lab (which happens on a 4-hourly basis, at 8am EDT and so on).

File structure of the full simulation reconstruction output

Each reconstruction output file has essentially the same structure, defined by the EIC Data Model. This structure can be retrieved with rootls, e.g.

root -l s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root

which produces the following output, two trees:

events metadata

You may encounter some (or many) warnings when you run this on a regular ROOT installation, but the ROOT files are built in such a way that they only use basic ("plain old data") types that ROOT can interpret without any helper classes. There are helper classes available inside the ATHENA container.

The events tree is of course what we are interested in. We can explore its top-level structure as follows. We first start a ROOT session:

root -l s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root

and then list the top-level branches in the events tree:

events->Print("toponly")

This will display a large number of branches (and their size):

******************************************************************************
*Tree :events : Events tree *
*Entries : 1002 : Total = 44146720 bytes File Size = 14534550 *
* : : Tree compression factor = 3.01 *
******************************************************************************
branch: mcparticles2 117074
branch: TrackerBarrelHits2 352204
branch: ReconstructedParticles 3763
branch: EcalBarrelHitsSimpleDigi 132444
branch: EcalBarrelHitsSimpleReco 162677
branch: EcalEndcapNHitsDigi 53267
branch: EcalEndcapNHitsReco 18557
branch: EcalEndcapNProtoClusters 2247
branch: EcalEndcapNClusters 5389
branch: outputInfoCollection 2744
branch: EcalEndcapPHitsDigi 79927
branch: EcalEndcapPHitsReco 57765
branch: EcalEndcapPHitsRecoXY 53667
branch: EcalEndcapPProtoClusters 2439
branch: EcalEndcapPClusters 5508
branch: EcalEndcapPClustersInfo 2833
branch: EcalBarrelHitsDigi 130174
branch: EcalBarrelHitsReco 297904
branch: EcalBarrelProtoClusters 3416
branch: EcalBarrelLayers 8672
branch: EcalBarrelClusters 9396
branch: EcalBarrelClustersInfo 5286
branch: EcalBarrelScFiHitsDigi 5345203
branch: EcalBarrelScFiHitsReco 5554064
branch: EcalBarrelScFiGridReco 1328000
branch: EcalBarrelScFiProtoClusters 67285
branch: EcalBarrelScFiClusters 36508
branch: EcalBarrelScFiClustersInfo 19039
branch: HcalBarrelHitsDigi 54355
branch: HcalBarrelHitsReco 44131
branch: HcalBarrelHitsRecoXY 36646
branch: HcalBarrelProtoClusters 1935
branch: HcalBarrelClusters 4251
branch: HcalBarrelClustersInfo 2269
branch: HcalEndcapNHitsDigi 11722
branch: HcalEndcapNHitsReco 13786
branch: HcalEndcapNHitsRecoXY 11962
branch: HcalEndcapNProtoClusters 1903
branch: HcalEndcapNClusters 4115
branch: HcalEndcapNClustersInfo 2215
branch: HcalEndcapPHitsDigi 7388
branch: HcalEndcapPHitsReco 8158
branch: HcalEndcapPHitsRecoXY 7660
branch: HcalEndcapPProtoClusters 1903
branch: HcalEndcapPClusters 4115
branch: HcalEndcapPClustersInfo 2215
branch: TrackerBarrelRawHits 51559
branch: TrackerEndcapRawHits 3664
branch: VertexBarrelRawHits 27933
branch: VertexEndcapRawHits 2283
branch: TrackerBarrelRecHits 148096
branch: TrackerEndcapRecHits 7560
branch: VertexBarrelRecHits 61902
branch: VertexEndcapRecHits 3984
branch: ReconstructedParticlesInitFromTruth 3277
branch: outputTrackParameters 3519
branch: DRICHHits2 8688
branch: DRICHHitsDigi 1998
branch: DRICHHitsReco 3525
branch: EcalEndcapNHits 3717
branch: EcalEndcapPHits 3717
branch: EcalBarrelHits 3701
branch: EcalBarrelScFiHits 3760
branch: HcalBarrelHits 3701
branch: HcalEndcapPHits 3717
branch: HcalEndcapNHits 3717
branch: TrackerEndcapHits 4372
branch: TrackerBarrelHits 4372
branch: VertexBarrelHits 4352
branch: VertexEndcapHits 4352
branch: DRICHHits 4217

During the development of the reconstruction, there are more branches enabled here than are strictly necessary (e.g. simulated and digitized hits, intermediate reconstruction parameters). These are all available for analysis (with fixed interfaces). In this tutorial we will focus on a few branches in particular:

  • ReconstructedParticles: contains the results from track finding and fitting,
  • EcalBarrelImagingClusters: contains the results from the barrel Imaging Ecal cluster finding,
  • EcalBarrelScFiClusters: contains the results from the barrel ScFi Ecal cluster finding.

We can inspect each of these three branches in more detail (some information removed for formatting)

root [14] events->Print("ReconstructedParticles*")
******************************************************************************
*Tree :events : Events tree *
*Entries : 1002 : Total = 44146720 bytes File Size = 14534550 *
* : : Tree compression factor = 3.01 *
******************************************************************************
*Br 0 :ReconstructedParticles : Int_t ReconstructedParticles_ *
*Br 1 :ReconstructedParticles.ID.value : Int_t value[ReconstructedParticles_] *
*Br 2 :ReconstructedParticles.p.x : Float_t x[ReconstructedParticles_] *
*Br 3 :ReconstructedParticles.p.y : Float_t y[ReconstructedParticles_] *
*Br 4 :ReconstructedParticles.p.z : Float_t z[ReconstructedParticles_] *
*Br 5 :ReconstructedParticles.v.x : Float_t x[ReconstructedParticles_] *
*Br 6 :ReconstructedParticles.v.y : Float_t y[ReconstructedParticles_] *
*Br 7 :ReconstructedParticles.v.z : Float_t z[ReconstructedParticles_] *
*Br 8 :ReconstructedParticles.time : Float_t time[ReconstructedParticles_]*
*Br 9 :ReconstructedParticles.pid : Int_t pid[ReconstructedParticles_] *
*Br 10 :ReconstructedParticles.status : Short_t status[ReconstructedParticles_] *
*Br 11 :ReconstructedParticles.charge : Short_t charge[ReconstructedParticles_] *
*Br 12 :ReconstructedParticles.momentum : Float_t momentum[ReconstructedParticles_] *
*Br 13 :ReconstructedParticles.energy : Float_t energy[ReconstructedParticles_] *
*Br 14 :ReconstructedParticles.mass : Float_t mass[ReconstructedParticles_]*
*Br 15 :ReconstructedParticles.weight.value : *
* | Float_t value[ReconstructedParticles_] *
*............................................................................*
root [6] events->Print("EcalBarrelScFiClusters*")
******************************************************************************
*Tree :events : Events tree *
*Entries : 1002 : Total = 44146720 bytes File Size = 14534550 *
* : : Tree compression factor = 3.01 *
******************************************************************************
*Br 0 :EcalBarrelScFiClusters : Int_t EcalBarrelScFiClusters_ *
*Br 1 :EcalBarrelScFiClusters.ID.value : *
* | Int_t value[EcalBarrelScFiClusters_] *
*Br 2 :EcalBarrelScFiClusters.type : Int_t type[EcalBarrelScFiClusters_] *
*Br 3 :EcalBarrelScFiClusters.energy : *
* | Float_t energy[EcalBarrelScFiClusters_] *
*Br 4 :EcalBarrelScFiClusters.energyError : *
* | Float_t energyError[EcalBarrelScFiClusters_] *
*Br 5 :EcalBarrelScFiClusters.time : Float_t time[EcalBarrelScFiClusters_]*
*Br 6 :EcalBarrelScFiClusters.nhits : *
* | UInt_t nhits[EcalBarrelScFiClusters_] *
*Br 7 :EcalBarrelScFiClusters.position.x : *
* | Float_t x[EcalBarrelScFiClusters_] *
*Br 8 :EcalBarrelScFiClusters.position.y : *
* | Float_t y[EcalBarrelScFiClusters_] *
*Br 9 :EcalBarrelScFiClusters.position.z : *
* | Float_t z[EcalBarrelScFiClusters_] *
*Br 10 :EcalBarrelScFiClusters.positionError.xx : *
* | Float_t xx[EcalBarrelScFiClusters_] *
*Br 11 :EcalBarrelScFiClusters.positionError.yy : *
* | Float_t yy[EcalBarrelScFiClusters_] *
*Br 12 :EcalBarrelScFiClusters.positionError.zz : *
* | Float_t zz[EcalBarrelScFiClusters_] *
*Br 13 :EcalBarrelScFiClusters.positionError.xy : *
* | Float_t xy[EcalBarrelScFiClusters_] *
*Br 14 :EcalBarrelScFiClusters.positionError.xz : *
* | Float_t xz[EcalBarrelScFiClusters_] *
*Br 15 :EcalBarrelScFiClusters.positionError.yz : *
* | Float_t yz[EcalBarrelScFiClusters_] *
*Br 16 :EcalBarrelScFiClusters.radius : *
* | Float_t radius[EcalBarrelScFiClusters_] *
*Br 17 :EcalBarrelScFiClusters.skewness : *
* | Float_t skewness[EcalBarrelScFiClusters_] *
*Br 18 :EcalBarrelScFiClustersInfo : Int_t EcalBarrelScFiClustersInfo_ *
*Br 19 :EcalBarrelScFiClustersInfo.clusterID.value : *
* | Int_t value[EcalBarrelScFiClustersInfo_] *
*Br 20 :EcalBarrelScFiClustersInfo.polar.r : *
* | Float_t r[EcalBarrelScFiClustersInfo_] *
*Br 21 :EcalBarrelScFiClustersInfo.polar.theta : *
* | Float_t theta[EcalBarrelScFiClustersInfo_] *
*Br 22 :EcalBarrelScFiClustersInfo.polar.phi : *
* | Float_t phi[EcalBarrelScFiClustersInfo_] *
*Br 23 :EcalBarrelScFiClustersInfo.eta : *
* | Float_t eta[EcalBarrelScFiClustersInfo_] *
*............................................................................*

The ReconstructedParticles branch contains the momentum, e.g. ReconstructedParticles.p.x for the x-component, and references to other entities related to this particle (clusters, tracks, particles).

Analysis of full simulation reconstruction output with traditional ROOT commands

After opening the reconstruction output file, let's make some pretty plots. We start with:

root -l s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root

and in the ROOT session:

root [1] .ls
TS3WebFile** s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root data file
TS3WebFile* s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/SINGLE/pi+/1GeV/45to135deg/pi+_1GeV_45to135deg.0001.root data file
KEY: TTree events;1 Events tree
KEY: TTree metadata;1 Metadata tree
events->Draw("ReconstructedParticles.p.x")
events->Draw("EcalBarrelImagingClustersInfo.polar.phi:EcalBarrelImagingClustersInfo.polar.theta", "EcalBarrelScFiClusters.energy", "colz")

These types of plots will likely be limited to simple data inspection.

Analysis of full simulation reconstruction output with RDataFrame commands

For a more advanced analysis, you can take advantage of the RDataFrame features, such as in this (shortened) DIS example. The original is used in our CI system and determines DIS parameters for every change to the detector geometry, simulation, digitization, or reconstruction.

auto momenta_from_reconstruction(const std::vector<eic::ReconstructedParticleData>& parts) {
std::vector<ROOT::Math::PxPyPzEVector> momenta{parts.size()};
std::transform(parts.begin(), parts.end(), momenta.begin(), [](const auto& part) {
return ROOT::Math::PxPyPzEVector{part.p.x, part.p.y, part.p.z, part.energy};
});
return momenta;
}
auto convertMtoE(const std::vector<ROOT::Math::PxPyPzMVector>& mom) {
std::vector<ROOT::Math::PxPyPzEVector> momenta{mom.size()};
std::transform(mom.begin(), mom.end(), momenta.begin(), [](const auto& part) {
return ROOT::Math::PxPyPzEVector{part.Px(), part.Py(), part.Pz(), part.energy()};
});
return momenta;
}
bool sort_mom_bool(ROOT::Math::PxPyPzEVector &mom1, ROOT::Math::PxPyPzEVector &mom2) {
return mom1.energy() > mom2.energy();
}
auto sort_momenta(const std::vector<ROOT::Math::PxPyPzEVector>& mom) {
std::vector <ROOT::Math::PxPyPzEVector> sort_mom = mom;
sort(sort_mom.begin(), sort_mom.end(), sort_mom_bool);
return sort_mom;
}
auto Q2(const std::vector<ROOT::Math::PxPyPzEVector>& mom) {
std::vector<double> Q2Vec(mom.size() );
ROOT::Math::PxPyPzEVector beamMom = {0, 0, 5, 5};
std::transform(mom.begin(), mom.end(), Q2Vec.begin(), [beamMom](const auto& part) {
return -(part - beamMom).M2();
});
return Q2Vec;
}
ROOT::RDataFrame d("events", "s3https://dtn01.sdcc.bnl.gov:9000/eictest/ATHENA/RECO/master/DIS/NC/5x41/minQ2=1/pythia8NCDIS_5x41_minQ2=1_beamEffects_xAngle=-0.025_hiDiv_1.0001.root");
auto d0 = d.Define("p", momenta_from_reconstruction, {"ReconstructedParticles"}).Define("Q2", Q2, {"p"});
auto h_Q2_sim = d0.Histo1D({"h_Q2_sim", "; GeV; counts", 100, -5, 25}, "Q2");
auto& h1_Q2_sim = *h_Q2_sim;
h1_Q2_sim.DrawClone("hist");
Edit this page on eicweb