Introduction
Overview
Teaching: 5 min
Exercises: 0 minQuestions
Objectives
Define physics goal of new reconstruction algorithm
Identify what information is needed to accomplish this goal
Background
The simulation-reconstruction-analysis pipeline
The ePIC software pipeline consists of the following stages.
- Event generation
- Simulation
- Digitization (eventually)
- Reconstruction
- Analysis
This tutorial focuses exclusively on the reconstruction stage and the EICrecon software package. At the highest conceptual level, EICrecon’s inputs are simulated (and eventually, optionally digitized) detector hits, and the outputs are reconstructed tracks and particle identifications. Internally, EICrecon uses a data model and software architecture which is designed to cleanly separate dozens of different reconstruction algorithms and facilitate reconfiguring them and reusing them according to users’ needs.
Levels of user interaction with reconstruction
Depending on how a user contributes to ePIC, their level of interaction with EICrecon will vary. We identify four distinct patterns, or levels of interaction:
- Performing analyses using the reconstructed data as-is
- Modifying an algorithm’s parameters and re-running the reconstruction
- Applying an existing algorithm to a fresh context
- Adding a new algorithm
This tutorial is designed to be most helpful for users doing (3) or (4), although users at all levels will benefit from the deeper understanding of EICrecon’s architecture.
Motivating example: simple electron ID with E/p cut
One key ingredient in electron ID is the ratio of the energy deposited in the calorimeter (E) to the momentum of the particle track (p). For electrons, this should be close to 1. We want to implement a rudimentary electron ID algorithm by identifying particles that satisfy 0.9 < E/p < 1.2.
What information is required?
This simple electron ID algorithm requires three pieces of information, which will be obtained from pre-existing algorithms/factories:
- Reconstructed tracks
- Calorimeter clusters
- Matching between tracks and clusters
Matching between tracks and clusters can be obtained from truth information, or from track projections.
- In the former case, the reconstructed tracks and clusters are matched through their association to the same MC particle.
- In the latter case, a matching criteria is applied to the position of the calorimeter cluster and the projected position of the track at the calorimeter.
For the purposes of this tutorial, we will use truth matching.
The input and output objects of our factory should be stored as PODIO collections.
- Reconstructed tracks are stored as a collection of
edm4eic::ReconstructedParticle
- Calorimeter clusters are stored as a collection of
edm4eic::Cluster
- Associations between reconstructed tracks and MC particles are stored as a collection of
edm4eic::MCRecoParticleAssociation
- Associations between calorimeter clusters and MC particles are stored as a collection of
edm4eic:MCRecoClusterParticleAssociation
The members of each of these data types can be found in edm4eic.yaml
in the EDM4eic
repository.
Key Points
Creating a factory
Overview
Teaching: 10 min
Exercises: 1 minQuestions
Objectives
Understand the basics of EICrecon’s plugin structure
Understand where to put new factories
Understand which factory base class to use
Understand the JOmniFactory interface
Algorithms and Factories
We make a crisp distinction between algorithms and factories.
Algorithms are classes that perform one kind of calculation we need and they do so in a generic, framework-independent way. Factories, in comparison, are classes that attach Algorithms to the JANA framework. You should write Algorithms to be independently testable, and you should write a Factory so that JANA can use the Algorithm within EICrecon. The factory layer handles issues like obtaining all of the inputs from other factories, publishing the outputs so that other factories can use them or they can be written to file, obtaining the correct parameters, making sure the Algorithm has been initialized, and making sure that the correct calibrations are loaded when the run number changes.
Here’s an example to help illustrate what goes into the Algorithm and what goes into the Factory. Consider calorimeter clustering. The clustering algorithm should be independent of any individual detector, and have a set of parameters that control its behavior and live in a plain-old-data Config
object. You could copy-paste this code into a codebase that uses a completely different reconstruction framework and it would still work, as long as you were using the same datamodel (e.g. edm4hep
). Each detector could have its own factory (if it calls the algorithm in a substantially different way) or they may all use the same factory (if the factories only differ in their parameter values). The parameter values themselves could be hardcoded to the factory, but we strongly prefer to set them externally using a factory generator. This gives us a cleaner separation of configuration from code, and will let us do fun things in the future such as wiring factories together from an external config file, or performing parameter studies.
The basics of EICrecon’s plugin structure
JANA plugins are a mechanism for controlling which parts of EICrecon
get compiled and linked together. They give us the ability to avoid having to compile and link heavy dependencies that not everybody will be using all the time. For instance, by default EICrecon uses ACTS for tracking, but perhaps someone wants to benchmark ACTS against Genfit – we wouldn’t want to have to ship Genfit inside eic-shell all the time.
Plugins were also designed so that users could integrate their analyses directly into reconstruction while keeping them independent and optional. This pattern is heavily used in the GlueX experiment and recommended in the tutorials on JANA’s own documentation. In EICrecon, we set up separate plugins for each detector and each benchmark, but not for each analysis. We strongly recommend following the advice given in the analysis tutorials instead. The instructions for adding a new plugin are here.
Where to put new factories
The EICrecon plugins are organized as follows. Under src/detectors
we have subdirectories for each individual detector, and each of them corresponds to one plugin that adds the detector’s factory generators. Benchmarks are analogous. If an algorithm/factory will only ever be used in that one context, it can live there; otherwise, and preferably, the corresponding algorithm lives under src/algorithms
and the corresponding factory lives under src/factories
.
Once you figure out which plugin your algorithm naturally belongs to, find its InitPlugin()
method. By convention this lives in a .cc
file with the same name as the plugin itself. This is where you will add your factory generator.
Which factory base class to use
There are a number of different kinds of factories available in JANA which we have used within EICrecon at different points in time. Luckily, if you are writing an Algorithm from scratch, there is only one you will need to be familiar with: JOmniFactory
. However, some of the earlier ones are still around, and just in case you need to modify or reuse those, here is a quick history lesson.
JFactoryT<T>
is JANA’s fundamental factory base class. However, we don’t use it in EICrecon because it has the following limitations:
-
It has difficulty with PODIO data. PODIO data needs very special handling, otherwise it will leak memory or corrupt your object associations. To address this, we developed
JFactoryPodioT
, which extendsJFactoryT
to support PODIO correctly. -
It can only output one collection. This might seem fine at first, but frequently we need to output “Association” collections alongside the primary output collection. To address this, we developed
JMultifactory
, which supports multiple outputs, including PODIO data. -
If you want to reuse an Algorithm in a different context, you need to duplicate the JFactoryT/JPodioFactoryT/JMultifactory. Until this point, collection and parameter names were hardcoded inside individual factories. To get around this, we developed
JChainMultifactoryT
so that we could create multiple instances of the same factory and assign them different collection and parameter names in a logical way. -
It requires a deeper understanding of JANA internals to use correctly. The user is allowed to perform actions inside the factory callbacks that don’t necessarily make sense. We remedied this issue by developing
JOmniFactory
, which declares what it needs upfront, and JANA provides it only when it makes sense.JOmniFactory
supports all of the functionality developed for points (1), (2), and (3), and presents a simpler interface.
In summary, always use JOmniFactory
if you are writing something new. All existing factories in EICrecon are in the process of being migrated right now: https://github.com/eic/EICrecon/issues/1176.
The JOmniFactory interface
The basic idea behind an OmniFactory is to declare what you need upfront. That way, the framework can retrieve everything you need at the right time, and it can handle complex namespacing logic behind the scenes so that you can dynamically rewire and reconfigure factories.
Earlier factory base classes, such as JChainMultifactory, require users to do a lot in their callbacks. Not so with JOmniFactory, which moves most of the functionality into registered members instead, as we shall discuss later. The callbacks are still there, but are made much simple, and focus on satisfying the underlying Algorithm’s needs instead of JANA’s.
These are the callbacks you’ll need to implement:
void Configure();
void ChangeRun(int64_t run_number);
void Process(int64_t run_number, uint64_t event_number);
Configure
is called once when the factory is instantiated. This is where the user should initialize the underlying Algorithm. JANA will have already fetched the services, configured the logger, and set the values of the Config
struct, so all the user needs to do is pass these things to the Algorithm.
ChangeRun
is called once JANA detects that a new run has been started. This is where the user should update calibration data or other resources keyed off of the run number. JOmniFactory also provides a Resource
registered member to automatically retrieve data from an arbitrary Service, though this is still experimental.
Process
is called for every event. (Side note: Although note that because different threads have different factory instances, any individual factory cannot be guaranteed to witness the entire event stream. If you need to have one instance that processes the entire event stream, JANA provides JEventProcessors and JEventProcessorSequential for that purpose.) JANA will have already prefetched the registered Inputs before Process
is called. The user needs to execute the Algorithm using those inputs, and copy the resulting outputs back to the registered Outputs. JANA will then take care of publishing the outputs downstream.
Note that unlike earlier factory base classes, JOmniFactory uses the Curiously Recurring Template Pattern so that the callback methods aren’t virtual. This lets the optimizer get rid of any performance penalty for the extra layer of indirection.
Here is the full JOmniFactory code skeleton:
#pragma once
#include "extensions/jana/JOmniFactory.h"
class ReconstructedElectrons_factory : public JOmniFactory<ReconstructedElectrons_factory> {
private:
// Declare inputs and outputs
// PodioInput<edm4hep::MCParticle> m_in_mc_particles {this, "MCParticles"};
// PodioOutput<edm4eic::ReconstructedParticle> m_out_reco_particles {this};
// Declare parameters
// ParameterRef<double> m_min_energy_over_momentum {this, "minEnergyOverMomentum", config().min_energy_over_momentum};
// Declare services
// Service<DD4hep_service> m_geoSvc {this};
public:
void Configure() {
// This is called when the factory is instantiated.
// Use this callback to make sure the algorithm is configured.
// The logger, parameters, and services have all been fetched before this is called
}
void ChangeRun(int64_t run_number) {
// This is called whenever the run number is changed.
// Use this callback to retrieve state that is keyed off of run number.
}
void Process(int64_t run_number, uint64_t event_number) {
// This is called on every event.
// Use this callback to call your Algorithm using all inputs and outputs
// The inputs will have already been fetched for you at this point.
// m_algo->execute(...);
logger()->debug( "Event {}: Calling Process()", event_number );
}
};
The JOmniFactory inputs and outputs
The user specifies the JOmniFactory’s inputs by declaring PodioInput
or VariationalPodioInput
objects as data members. These are templated on the basic PODIO type (Not the collection type or mutable type or object type or data type), and require the user to pass this
as a constructor argument. These objects immediately register themselves with the factory, so that the factory always knows exactly what data it needs to fetch. To access the data once it has been fetched, the user can call the object’s operator()
, which returns a constant pointer to a PODIO collection of the correct type. For instance, suppose the user declares the data member:
PodioInput<MCParticles> m_particles_in {this};
In this case, the user would access the input data like this:
const MCParticlesCollection* particles_in = m_particles_in();
Of course, for brevity, the user could simply write this instead:
m_particles_out() = smearing_algo->execute( m_particles_in() );
As you have just seen, PodioOutputs are very analogous to PodioInputs.
Exercise:
- Create your own ElectronReconstruction factory from the code skeleton above
- Give your OmniFactory a single output collection
- Have its Process() method produce some log output
- Experiment with giving it different input collections
Key Points
Calling a factory
Overview
Teaching: 10 min
Exercises: 1 minQuestions
Objectives
Learn how to wire a factory using a factory generator
Learn how to call the factory as a once-off
Learn how to call the factory every time
Wiring a factory using a factory generator
Instead of handing over the OmniFactory to JANA directly, we create a JOmniFactoryGeneratorT
which is basically a recipe that JANA uses to create factories later. There are several reasons we need this:
-
JANA is designed to process events in parallel, when enabled. Factories are allowed to cache local state using member variables (e.g. calibrations and resources) Despite this, factories don’t need to be thread-safe, because each factory is only used by one thread at a time. This means that JANA needs to spin up at least one instance of each factory for each thread.
-
We want to be able to spin up separate instances of the same factory class (within the context of a single event and thread), and give them different parameter values and collection names. The way we manage these different instances is by giving each factory instance a unique prefix which will be used to namespace its parameters, inputs, outputs, and logger. This happens at the JFactoryGenerator level.
Here is how you set up a factory generator:
app->Add(new JOmniFactoryGeneratorT<MC2SmearedParticle_factory>(
"GeneratedParticles",
{"MCParticles"},
{"GeneratedParticles"},
app
));
In this example, “GeneratedParticles” is the factory instance’s unique tag, {"MCParticles"}
is the list of input collection names, and {"GeneratedParticles"}
is the list of output collection names. Some observations:
-
If you are only creating one instance of this factory, feel free to use the “primary” output collection name as the factory prefix. (This has to be unique because PODIO collection names have to be unique.)
-
Collection names are positional, so they need to be in the same order as the
PodioInput
andVariationalPodioInput
declarations in the factory. -
Variadic inputs are a little bit interesting: You can have any number of variadic inputs mixed in among the non-variadic inputs, as long as there are the same number of collection names for each variadic input. If this confuses you, just restrict yourself to one variadic input and put it as the very last input, like most programming languages do.
-
When assigning names to collections and
JOmniFactory
prefixes, uniqueness is extremely important! JANA will throw an error if it detects a naming collision, but because of the dynamic plugin loading, some collisions can’t be detected. DO NOT use the same output collection name or prefix in different plugins, and DO NOT rely the plugin loading order to make sure you ran the “correct” factory! To swap out different versions of a factory, change or override the input collection name on the factories and processors downstream.
Where to put factory generators
Factory generators need to be added inside an InitPlugin
for a particular plugin. If the factory generator is specific to one particular detector, it would go in that detector’s plugin. Because electron reconstruction is not, the generator is set up in one of the plugins under src/global
. Because this falls in the category of reconstruction, we put it in src/global/reco/reco.cc
.
Calling the factory
To temporarily include your factory’s outputs in the output file
On the command line, set the podio:output_include_collections
parameter to include your collection names:
eicrecon -Ppodio:output_include_collections=MyNewCollectionName1,MyNewCollectionName2 in.root
To permanently include your factory’s outputs in the output file:
Add your collection name to the output_include_collections
list in src/services/io/podio/JEventProcessorPODIO.cc:44
std::vector<std::string> output_include_collections={
"EventHeader",
"MCParticles",
"CentralTrackingRecHits",
"CentralTrackSeedingResults",
"CentralTrackerMeasurements",
//...
To temporarily use your factory’s outputs as inputs to another factory
eicrecon -Ptargetfactory:InputTags=MyNewCollectionName1,MyNewCollection2 in.root
To permanently use your factory’s outputs as inputs to another factory
Change the collection name in the OmniFactoryGeneratorT
or JChainMultifactoryGeneratorT
:
app->Add(new JOmniFactoryGeneratorT<MC2SmearedParticle_factory>(
"GeneratedParticles",
{"MCParticlesSmeared"}, // <== Used to be "MCParticles"
{"GeneratedParticles"},
app
));
Exercise:
- Create a JOmniFactoryGenerator for your ElectronReconstruction factory
- Give your factory’s output collection a fun name
- Call your factory from the command line and verify that you see its logger output.
- Add it to the
JEventSourcePODIO::output_include_collections
, so that it gets called automatically.- Experiment with multiple factory generators so you can have multiple instances of the same factory
Key Points
Parameterizing a factory
Overview
Teaching: 5 min
Exercises: 0 minQuestions
Objectives
Learn how to set parameters on a factory
Learn how to override factories via a generator
Learn how to override factories via the command line
Learn how to access services from a factory
Setting parameters on a factory
Parameters are also handled using registered members. JOmniFactory provides a Parameter
class template which can hold its own value, but in EICrecon we prefer to use Config structs. Thus JOmniFactory provides ParameterRef
, which stores a reference into the Config object.
ParameterRef<double> m_samplingFraction {this, "samplingFraction", config().sampFrac};
ParameterRef<std::string> m_energyWeight {this, "energyWeight", config().energyWeight};
Parameters are fetched immediately before Init()
is called, so you may access them from any of the callbacks like so:
void Process(int64_t run_number, uint64_t event_number) {
logger()->debug( "Event {}: samplingFraction = {}", event_number, m_samplingFraction() );
}
Because we are using ParameterRefs, we can also access the field the ref points to directly:
void Process(int64_t run_number, uint64_t event_number) {
logger()->debug( "Event {}: samplingFraction = {}", event_number, config().sampFrac );
}
Config objects
We create a plain-old struct to hold our parameters. For now this config struct can live in the same header file as our factory, although eventually it should belong with the algorithm instead.
struct ReconstructedElectrons_config {
double threshold = 0.9;
int bucket_count = 4;
// ...
}
By passing it in to the JOmniFactory base class, we can make it automatically available via the config()
method.
class ReconstructedElectrons_factory : public JOmniFactory<ReconstructedElectrons_factory, ElectronReconstructionConfig> {
...
}
Overriding parameters via a generator
If you use a Config object for your parameters, you can pass it in directly to the factory generator:
app->Add(new JOmniFactoryGeneratorT<BasicTestAlg>(
"FunTest", {"MyHits"}, {"MyClusters"},
{
.threshold = 6.1,
.bucket_count = 22
},
app));
Overriding parameters via the command line
We can override parameters on the command line like so:
eicrecon -PFunTest:threshold=12.0 in.root
Accessing services from a factory
Services are singletons that provide access to resources such as loggers, geometry, magnetic field maps, etc. Services need to be thread-safe because they are shared by different threads. The most relevant service right now is DD4hep_service
. You obtain a service using a registered member like this:
Service<DD4hep_service> m_geoSvc {this};
Oftentimes we want to retrieve a resource from a Service and refresh it whenever the run number changes. OmniFactory provides Resource
for this purpose.
Exercise:
- Give your factory a Config struct
- Give your Config struct some parameters
- Experiment with overriding parameter values in the generator and on the command line.
Key Points
Adding an algorithm
Overview
Teaching: 5 min
Exercises: 1 minQuestions
Objectives
Understand the difference between a factory and an algorithm
Understand where to put the algorithm code
Understand the basic algorithm interface
Understand how to call an algorithm from a factory
The difference between a factory and an algorithm
Algorithms are classes that perform one kind of calculation we need and they do so in a generic, framework-independent way. The core of an Algorithm is a method called execute
which inputs some PODIO collections and outputs some other PODIO collections. Algorithms don’t know or care where the inputs come from and where they go. Algorithms also don’t know much about where their parameters come from; rather, they are passed a Config
structure which contains the parameters’ values. The nice thing about algorithms is that they are simple to design and test, and easy to reuse for different detectors, frameworks, or even entire experiments.
Most of what makes an Algorithm an Algorithm is convention. (These are largely inspired by the KISS principle in software engineering!) There is an ongoing effort to create a “framework-less framework” for formally expressing Algorithms using templates, which lives at https://github.com/eic/algorithms. Eventually, we may encourage users to have all Algorithms inherit from the Algorithm<Input<...>, Output<...>>
templated interface. For now, however, just follow the Algorithm conventions that we will go over next.
Where to put the algorithm code
All algorithms that are not specific to a single detector should go under src/algorithms
. Because this falls in the category of reconstruction, we’ll put it in src/algorithms/reco
.
The basic algorithm interface
Here is a template for an algorithm header file:
#pragma once
// #include relevant header files here
namespace eicrecon {
class MyAlgorithmName {
public:
// init function contains any required initialization
void init();
// execute function contains main algorithm processes
// (e.g. manipulate existing objects to create new objects)
std::unique_ptr<MyReturnDataType> execute();
// Any additional public members go here
private:
std::shared_ptr<spdlog::logger> m_log;
// any additional private members (e.g. services and calibrations) go here
};
} // namespace eicrecon
How to call an algorithm from a factory
The code to call an algorithm from a factory generally follows a specific pattern:
void Configure() {
// This is called when the factory is instantiated.
// Use this callback to make sure the algorithm is configured.
// The logger, parameters, and services have all been fetched before this is called
m_algo = std::make_unique<eicrecon::ElectronReconstruction>();
// Pass config object to algorithm
m_algo->applyConfig(config());
// If we needed geometry, we'd obtain it like so
// m_algo->init(m_geoSvc().detector(), m_geoSvc().converter(), logger());
m_algo->init(logger());
}
void Process(int64_t run_number, uint64_t event_number) {
// This is called on every event.
// Use this callback to call your Algorithm using all inputs and outputs
// The inputs will have already been fetched for you at this point.
auto output = m_algo->execute(
m_in_mc_particles(),
m_in_rc_particles(),
m_in_rc_particles_assoc(),
m_in_clu_assoc()
);
m_out_reco_particles() = std::move(output);
// JANA will take care of publishing the outputs for you.
}
Exercise:
- Create your own ElectronReconstruction algorithm using the code skeleton above.
- Print some log messages from your algorithm’s
execute()
method.- Have your ElectronReconstruction factory call the algorithm.
- Run this end-to-end.
Key Points
Working with PODIO
Overview
Teaching: 5 min
Exercises: 1 minQuestions
Objectives
Gain familiarity working with PODIO collections
Understand PODIO subset collections
Introduction to PODIO
Our data model is in a library/namespace/repository called edm4eic
, and it is built on top of edm4hep
, a data model designed to capture commonalities across HEP experiments. edm4eic
is implemented using PODIO, which is a toolkit for generating the data model classes from a specification written in YAML. Here is a very simple example of a PODIO specification:
options :
# should getters / setters be prefixed with get / set?
getSyntax: False
# should POD members be exposed with getters/setters in classes that have them as members?
exposePODMembers: True
includeSubfolder: True
datatypes :
ExampleHit :
Description : "Hit"
Author : "B. Hegner"
Members:
- unsigned long long cellID // cellID
- double x // x-coordinate
- double y // y-coordinate
- double z // z-coordinate
- double energy // measured energy deposit
ExampleCluster :
Description : "Cluster"
Author : "N. Brei"
Members:
- double energy // cluster energy
OneToManyRelations:
- ExampleHit Hits // hits contained in the cluster
- ExampleCluster Clusters // sub clusters used to create this cluster
PODIO will then generate for us the following classes:
DatamodelDefinition.h ExampleCluster.h ExampleClusterCollection.h ExampleClusterCollectionData.h ExampleClusterData.h ExampleClusterObj.h ExampleHit.h ExampleHitCollection.h ExampleHitCollectionData.h ExampleHitData.h ExampleHitObj.h MutableExampleHit.h MutableExampleCluster.h
As you can see, PODIO has a lot of moving pieces. Why?
- PODIO adds a separate layer for managing memory in a way which is more consistent with Python and other garbage-collected languages. The user only has to work with values, no explicit allocations or deletions.
- PODIO separates the data’s memory layout from its accessors
- PODIO enforces immutability directly in the object model
- PODIO has sophisticated (though fragile!) mechanisms for tracking object references
These design principles in principle should eliminate entire classes of bugs. However, there are still subtleties when using PODIO that can lead to leaks, crashes, or corrupted references. Luckily, the correct usage pattern is quite simple, as we will discuss next.
Working with PODIO objects, collections, and subset collections:
auto hits = std::make_unique<ExampleHitCollection>();
hits->push_back(ExampleHit(22, 0.0, 0.0, 0.0, 0.001));
MutableExampleHit hit;
hit.x(0.0);
hit.energy(0.001);
// ...
hits->push_back(hit);
MutableExampleCluster cluster;
cluster.addHits(hit);
// Safety tip: Add object to a collection BEFORE creating an association to it
auto clusters = std::make_unique<ExampleClusterCollection>();
clusters->push_back(cluster);
auto subset_clusters = std::make_unique<ExampleClusterCollection>();
subset_clusters->setSubsetCollection(true);
subset_clusters->push_back(cluster);
// Safety tip: Every PODIO object is owned by exactly one collection.
// If you want to put the object in other collections, those collections need to
// be designated as "subset collections", which means that they don't own their contents.
Note that when you write a factory, its inputs will be const ExampleHitCollection*
, which are immmutable.
Its output will be std::unique_ptr<ExampleHitCollection>
, which is still mutable but will transfer its ownership to JANA2. JANA2 will add the collection to a podio Frame
. From that point on, the collection is immutable and owned by the Frame
.
JANA2 will create and destroy Frame
s internally.
Exercise:
- Have your algorithm produce some (fake) output data!
Key Points
Putting everything together
Overview
Teaching: 5 min
Exercises: 0 minQuestions
Objectives
Get the electron finder running end-to-end
The final ReconstructedElectron factory
Here is the final ReconstructedElectron factory. Our algorithm requires reconstructed tracks, calorimeter clusters, and associations between the two. As previously stated, the current implementation of the simple electron ID uses the truth association between tracks and clusters (association using matching between clusters and track projections will be implemented later). Thus, we need two sets of associations: association between the truth particle and the reconstructed charged particle, and association between the truth particle and the calorimeter cluster. Obviously, we will not create these objects from scratch. Rather, we will get them from the factories (and underlying algorithms) implemented to create these objects.
src/global/reco/ReconstructedElectrons_factory.h
in branch nbrei_variadic_omnifactories
:
#pragma once
#include "extensions/jana/JOmniFactory.h"
#include "algorithms/reco/ElectronReconstruction.h"
namespace eicrecon {
class ReconstructedElectrons_factory : public JOmniFactory<ReconstructedElectrons_factory, ElectronReconstructionConfig> {
private:
// Underlying algorithm
std::unique_ptr<eicrecon::ElectronReconstruction> m_algo;
// Declare inputs
PodioInput<edm4hep::MCParticle> m_in_mc_particles {this, "MCParticles"};
PodioInput<edm4eic::ReconstructedParticle> m_in_rc_particles {this, "ReconstructedChargedParticles"};
PodioInput<edm4eic::MCRecoParticleAssociation> m_in_rc_particles_assoc {this, "ReconstructedChargedParticleAssociations"};
VariadicPodioInput<edm4eic::MCRecoClusterParticleAssociation> m_in_clu_assoc {this};
// Declare outputs
PodioOutput<edm4eic::ReconstructedParticle> m_out_reco_particles {this};
// Declare parameters
ParameterRef<double> m_min_energy_over_momentum {this, "minEnergyOverMomentum", config().min_energy_over_momentum};
ParameterRef<double> m_max_energy_over_momentum {this, "maxEnergyOverMomentum", config().max_energy_over_momentum};
// Declare services here, e.g.
// Service<DD4hep_service> m_geoSvc {this};
public:
void Configure() {
// This is called when the factory is instantiated.
// Use this callback to make sure the algorithm is configured.
// The logger, parameters, and services have all been fetched before this is called
m_algo = std::make_unique<eicrecon::ElectronReconstruction>();
// Pass config object to algorithm
m_algo->applyConfig(config());
// If we needed geometry, we'd obtain it like so
// m_algo->init(m_geoSvc().detector(), m_geoSvc().converter(), logger());
m_algo->init(logger());
}
void ChangeRun(int64_t run_number) {
// This is called whenever the run number is changed.
// Use this callback to retrieve state that is keyed off of run number.
// This state should usually be managed by a Service.
// Note: You usually don't need this, because you can declare a Resource instead.
}
void Process(int64_t run_number, uint64_t event_number) {
// This is called on every event.
// Use this callback to call your Algorithm using all inputs and outputs
// The inputs will have already been fetched for you at this point.
auto output = m_algo->execute(
m_in_mc_particles(),
m_in_rc_particles(),
m_in_rc_particles_assoc(),
m_in_clu_assoc()
);
logger()->debug( "Event {}: Found {} reconstructed electron candidates", event_number, output->size() );
m_out_reco_particles() = std::move(output);
// JANA will take care of publishing the outputs for you.
}
};
} // namespace eicrecon
Next, we register this with the reco
plugin in src/global/reco/reco.cc:
app->Add(new JOmniFactoryGeneratorT<ReconstructedElectrons_factory>(
"ReconstructedElectrons",
{"MCParticles", "ReconstructedChargedParticles", "ReconstructedChargedParticleAssociations",
"EcalBarrelScFiClusterAssociations",
"EcalEndcapNClusterAssociations",
"EcalEndcapPClusterAssociations",
"EcalEndcapPInsertClusterAssociations",
"EcalLumiSpecClusterAssociations",
},
{"ReconstructedElectrons"},
{}, // Override config values here
app
));
And finally, we add its output collection name to the output include list in src/services/io/podio/JEventProcessorPODIO:
"ReconstructedElectrons",
src/algorithms/reco/ElectronReconstruction.h
:
#pragma once
#include <edm4eic/MCRecoClusterParticleAssociationCollection.h>
#include <edm4eic/MCRecoParticleAssociationCollection.h>
#include <edm4eic/ReconstructedParticleCollection.h>
#include <edm4hep/MCParticleCollection.h>
#include <spdlog/logger.h>
#include <memory>
#include <vector>
#include "ElectronReconstructionConfig.h"
#include "algorithms/interfaces/WithPodConfig.h"
namespace eicrecon {
class ElectronReconstruction : public WithPodConfig<ElectronReconstructionConfig>{
public:
// Initialization will set the pointer of the logger
void init(std::shared_ptr<spdlog::logger> logger);
// Algorithm will apply E/p cut to reconstructed tracks based on truth track-cluster associations
std::unique_ptr<edm4eic::ReconstructedParticleCollection> execute(
const edm4hep::MCParticleCollection *mcparts,
const edm4eic::ReconstructedParticleCollection *rcparts,
const edm4eic::MCRecoParticleAssociationCollection *rcassoc,
const std::vector<const edm4eic::MCRecoClusterParticleAssociationCollection*> &in_clu_assoc
);
// Could overload execute here to, for instance, use track projections
// for track-cluster association (instead of truth information)
private:
std::shared_ptr<spdlog::logger> m_log; // pointer to logger
double m_electron{0.000510998928}; // electron mass
};
} // namespace eicrecon
Note that the algorithm’s parameters are enclosed in a Config object which lives at src/algorithms/reco/ElectronReconstructionConfig.h
and can be accessed via the protected member variable m_cfg
.
struct ElectronReconstructionConfig {
double min_energy_over_momentum = 0.9;
double max_energy_over_momentum = 1.2;
};
The algorithm itself lives at src/algorithms/reco/ElectronReconstruction.cc
:
#include "ElectronReconstruction.h"
#include <edm4eic/ClusterCollection.h>
#include <edm4hep/utils/vector_utils.h>
#include <fmt/core.h>
namespace eicrecon {
void ElectronReconstruction::init(std::shared_ptr<spdlog::logger> logger) {
m_log = logger;
}
std::unique_ptr<edm4eic::ReconstructedParticleCollection> ElectronReconstruction::execute(
const edm4hep::MCParticleCollection *mcparts,
const edm4eic::ReconstructedParticleCollection *rcparts,
const edm4eic::MCRecoParticleAssociationCollection *rcassoc,
const std::vector<const edm4eic::MCRecoClusterParticleAssociationCollection*> &in_clu_assoc
) {
// Step 1. Loop through MCParticle - cluster associations
// Step 2. Get Reco particle for the Mc Particle matched to cluster
// Step 3. Apply E/p cut using Reco cluster Energy and Reco Particle momentum
// Some obvious improvements:
// - E/p cut from real study optimized for electron finding and hadron rejection
// - use of any HCAL info?
// - check for duplicates?
// output container
auto out_electrons = std::make_unique<edm4eic::ReconstructedParticleCollection>();
for ( const auto *col : in_clu_assoc ){ // loop on cluster association collections
for ( auto clu_assoc : (*col) ){ // loop on MCRecoClusterParticleAssociation in this particular collection
auto sim = clu_assoc.getSim(); // McParticle
auto clu = clu_assoc.getRec(); // RecoCluster
m_log->trace( "SimId={}, CluId={}", clu_assoc.getSimID(), clu_assoc.getRecID() );
m_log->trace( "MCParticle: Energy={} GeV, p={} GeV, E/p = {} for PDG: {}", clu.getEnergy(), edm4hep::utils::magnitude(sim.getMomentum()), clu.getEnergy() / edm4hep::utils::magnitude(sim.getMomentum()), sim.getPDG() );
// Find the Reconstructed particle associated to the MC Particle that is matched with this reco cluster
// i.e. take (MC Particle <-> RC Cluster) + ( MC Particle <-> RC Particle ) = ( RC Particle <-> RC Cluster )
auto reco_part_assoc = rcassoc->begin();
for (; reco_part_assoc != rcassoc->end(); ++reco_part_assoc) {
if (reco_part_assoc->getSimID() == (unsigned) clu_assoc.getSimID()) {
break;
}
}
// if we found a reco particle then test for electron compatibility
if ( reco_part_assoc != rcassoc->end() ){
auto reco_part = reco_part_assoc->getRec();
double EoverP = clu.getEnergy() / edm4hep::utils::magnitude(reco_part.getMomentum());
m_log->trace( "ReconstructedParticle: Energy={} GeV, p={} GeV, E/p = {} for PDG (from truth): {}", clu.getEnergy(), edm4hep::utils::magnitude(reco_part.getMomentum()), EoverP, sim.getPDG() );
// Apply the E/p cut here to select electons
if ( EoverP >= m_cfg.min_energy_over_momentum && EoverP <= m_cfg.max_energy_over_momentum ) {
out_electrons->push_back(reco_part.clone());
}
} else {
m_log->debug( "Could not find reconstructed particle for SimId={}", clu_assoc.getSimID() );
}
} // loop on MC particle to cluster associations in collection
} // loop on collections
m_log->debug( "Found {} electron candidates", out_electrons->size() );
return out_electrons;
}
}
Key Points