diff --git a/Events.md b/Events.md new file mode 100644 index 00000000..511eefbd --- /dev/null +++ b/Events.md @@ -0,0 +1,84 @@ +# Events + +An [Event](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Event.scala) + is a structure that homogenizes the content of the output datasets. +At minimum, it consists of an object to be instantiated and a trait containing the necessary functions. +In order to create an `Event`, you first need to understand what it is. +An event is an occurrence defined by a patient identifier, a category, a start date, a group identifier, a value, a weight and an end date; +of these, only the first three are mandatory. The seven elements that form an `Event` are: +- patientID: the patient identifier. +- category: defines the event's category. +- groupID: contains the ID of a group of related events. +- value: contains string values for the molecule name, the diagnosis code, etc. +- weight: contains double values for medical act weighting or other numerical values. +- start: the start of the event's period. +- end: the end of the event's period. +```scala + patientID: String, + category: EventCategory[A], + groupID: String, + value: String, + weight: Double, + start: Timestamp, + end: Option[Timestamp] +``` +All events inherit from [AnyEvent](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/AnyEvent.scala) + and [EventBuilder](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/EventBuilder.scala). + `AnyEvent` is a trait that carries the category value, and `EventBuilder` is a trait used to build an `Event`. 
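To make the seven fields concrete, here is a minimal, self-contained sketch; the simplified `Event` case class below is an assumption modelled on the signature quoted above, not the repository's actual definition:

```scala
import java.sql.Timestamp

// Simplified stand-in for the repository's Event case class, for illustration.
case class Event[A](
  patientID: String,  // mandatory
  category: String,   // mandatory (EventCategory[A] is essentially a String label)
  groupID: String,
  value: String,
  weight: Double,
  start: Timestamp,   // mandatory
  end: Option[Timestamp])

sealed trait Drug

// A hypothetical drug-purchase event: only patientID, category and start carry
// real information here; the other fields fall back to neutral defaults.
val purchase = Event[Drug](
  patientID = "patient_1",
  category  = "drug",
  groupID   = "NA",
  value     = "paracetamol",
  weight    = 0.0,
  start     = Timestamp.valueOf("2015-01-01 00:00:00"),
  end       = None)
```

The real objects in the repository hide these defaults behind `apply` methods, as the examples below show.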
+ + +[ObservationPeriod](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/ObservationPeriod.scala) +and [MedicalAct](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalAct.scala) +are good examples to illustrate the construction of an `Event`. +The first one, `ObservationPeriod`, defines the values `patientID`, `category`, `start` and, if it exists, `end`. +The other values are present but use default values. Its trait just assigns the category value and uses the `apply` method to build an `Event`. +```scala + val category: EventCategory[ObservationPeriod] = "observation_period" + + /** Creates an Event object of type ObservationPeriod using a map function to map a dataset. + * + * @param patientID The value patientID from dataset. + * @param start The value start from dataset. + * @param end The value end from dataset. + * @return Event[ObservationPeriod]. + */ + def apply(patientID: String, start: Timestamp, end: Timestamp): Event[ObservationPeriod] = + Event(patientID, category, groupID = "NA", value = "NA", weight = 0D, start, Some(end)) +``` +Its companion object adds nothing; it just inherits the trait. +```scala +object ObservationPeriod extends ObservationPeriod +``` +On the other hand, `MedicalAct` defines values for all elements except `end`. +Its trait uses two `apply` methods to assign values as needed, and leaves `category` unassigned; +it is assigned in each object according to the type. 
+ +```scala + + override val category: EventCategory[MedicalAct] + + def apply(patientID: String, groupID: String, code: String, weight: Double, date: Timestamp): Event[MedicalAct] = { + Event(patientID, category, groupID, code, weight, date, None) + } + + def apply(patientID: String, groupID: String, code: String, date: Timestamp): Event[MedicalAct] = { + Event(patientID, category, groupID, code, 0.0, date, None) + } +``` +`MedicalAct` has several objects, one per type of medical act. +Each object assigns a category and, in some cases, contains an inner object that stores `groupID` values. + +```scala +object BiologyDcirAct extends MedicalAct { + override val category: EventCategory[MedicalAct] = "dcir_biology_act" + + object groupID { + val PrivateAmbulatory = "private_ambulatory" + val PublicAmbulatory = "public_ambulatory" + val PrivateHospital = "private_hospital" + val Liberal = "liberal" + val DcirAct = "DCIR_act" + val Unknown = "unknown_source" + } +} +``` \ No newline at end of file diff --git a/Extractors.md b/Extractors.md new file mode 100644 index 00000000..38939ce6 --- /dev/null +++ b/Extractors.md @@ -0,0 +1,72 @@ + + +# Extractors + +Extractors are jobs that extract the required columns from the sources and map them to the `Event` ([Events](Events.md)) being extracted. +An extractor is composed of several basic components that together provide all of its functionality. 
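Before drilling into the hierarchy, the shared filter-then-build flow can be sketched in a simplified form; all names here are illustrative stand-ins, not the repository's API:

```scala
// Toy model of the extraction flow: keep rows in the extractor's scope,
// optionally restrict to the study codes, then build events from the rows.
case class SourceRow(patientID: String, code: String)
case class SimpleEvent(patientID: String, category: String, value: String)

def extract(rows: Seq[SourceRow], codes: Set[String]): Seq[SimpleEvent] = {
  val inScope = rows.filter(_.code.nonEmpty)           // plays the role of isInExtractorScope
  val inStudy =
    if (codes.isEmpty) inScope                         // no study restriction requested
    else inScope.filter(r => codes.contains(r.code))   // plays the role of isInStudy
  inStudy.map(r => SimpleEvent(r.patientID, "medical_act", r.code)).distinct
}
```

The real implementation works on Spark `DataFrame`s rather than `Seq`s, but the shape of the pipeline is the same.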
+From a hierarchical point of view we have: + +1) Base elements are traits such as [Extractor](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/Extractor.scala), +[McoSource](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSource.scala) and +[EventRowExtractor](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractor.scala). +Each manages a different part of the extraction. They are all necessary for the creation of an element of the next level. +2) Intermediate elements are traits that bring together the basic methods to create a common trait to extract the data from the sources (mco, dcir, ssr, had). In our example, +[McoExtractor](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoExtractor.scala) +is the base for retrieving the necessary data from the mco source. These elements inherit the base elements and accept as a parameter a trait of type `EventType`. +3) Upper elements are objects; they are the entry point of the job itself. They inherit from the intermediate elements and specialize them for one type of `Event`. + +Each basic component has a function: + - [Extractor](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/Extractor.scala) + is a trait that works by filtering, extracting and building. Its main method, `extract`, filters the sources and builds a dataset of type `Event`. 
+```scala +def extract(sources: Sources, codes: Set[String])(implicit ctag: TypeTag[EventType]): Dataset[Event[EventType]] = { + val input: DataFrame = getInput(sources) + import input.sqlContext.implicits._ + { + if (codes.isEmpty) { + input.filter(isInExtractorScope _) + } + else { + input.filter(isInExtractorScope _).filter(isInStudy(codes) _) + } + }.flatMap(builder _).distinct() + } +``` +- [McoSource](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSource.scala) +This trait is specific to each source (mco, dcir, ssr, had), containing the values relating to the columns and the methods specific to that source. +- [EventRowExtractor](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractor.scala) +This trait contains methods for extracting the fields needed to create an `Event`. + +The intermediate elements implement the required methods and adapt them if necessary. +Two good examples of methods implemented and adapted to suit the specificities of an `Event` type are the `builder` and `extractGroupId` methods. + +```scala +def builder(row: Row): Seq[Event[EventType]] = { + lazy val patientId = extractPatientId(row) + lazy val groupId = extractGroupId(row) + lazy val eventDate = extractStart(row) + lazy val endDate = extractEnd(row) + lazy val weight = extractWeight(row) + + Seq(eventBuilder[EventType](patientId, groupId, code(row), weight, eventDate, endDate)) + } +``` +```scala + override def extractGroupId(r: Row): String = { + r.getAs[String](ColNames.EtaNum) + "_" + + r.getAs[String](ColNames.RsaNum) + "_" + + r.getAs[Int](ColNames.Year).toString + } +``` + +The upper elements are responsible for defining the type of `Event`; they must implement at least the `columnName` and `eventBuilder` values and adapt them to their specificity. 
+For example, see [McoHospitalStaysExtractor](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStaysExtractor.scala): +```scala +object McoHospitalStaysExtractor extends McoExtractor[HospitalStay] { + + override val columnName: String = ColNames.EndDate + override val eventBuilder: EventBuilder = McoHospitalStay +} +``` + + diff --git a/Transformer.md b/Transformer.md new file mode 100644 index 00000000..d35f5749 --- /dev/null +++ b/Transformer.md @@ -0,0 +1,41 @@ +# Transformers + +A transformer is used to turn one or multiple datasets +into another one. A `Transformer` accepts as a parameter +a configuration class of type + [TransformerConfig](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/TransformerConfig.scala). +This configuration class controls the behaviour of the `Transformer` through its methods and values. + + +`transform` is the main and only publicly available method of a `Transformer`. This method accepts one or several datasets as input + and, after applying the transformation logic, returns a dataset of type `Event` ([Events](Events.md)) with a new `category`. + + If we take as an example [ExposureTransformer](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureTransformer.scala), + we pass as a parameter an instance of the class [ExposuresTransformerConfig](https://github.com/X-DataInitiative/SCALPEL-Extraction/blob/master/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposuresTransformerConfig.scala), + which extends `TransformerConfig`. + ```scala +/** + * A tag to improve readability of subclasses and to allow type binding + */ + +trait TransformerConfig + ``` +The config class is used to control the Transformer's behaviour. 
+ ```scala +class ExposuresTransformerConfig( + val exposurePeriodAdder: ExposurePeriodAdder) extends TransformerConfig with Serializable + ``` +This main method takes as parameters two datasets of type `Event`, with subtypes `FollowUp` and `Drug` respectively, +and returns a dataset of type `Event` with subtype `Exposure`. + +```scala + def transform(followUps: Dataset[Event[FollowUp]])(drugs: Dataset[Event[Drug]]): Dataset[Event[Exposure]] = { + drugs + .transform(config.exposurePeriodAdder.toExposure(followUps)) + .transform(regulateWithFollowUps(followUps)) + } + ``` +The objective of the `transform` method is thus to combine, filter and check relations between multiple `Event`s to form a new `Event`. +In the example of the `ExposureTransformer`, it combines multiple `Drug` `Event`s based on the logic of +the `exposurePeriodAdder` of the `ExposuresTransformerConfig` to form an `Exposure` while +making sure that it is contained within the period defined in the `FollowUp` `Event`. \ No newline at end of file diff --git a/project/build.properties b/project/build.properties index 64317fda..8e682c52 100644 --- a/project/build.properties +++ b/project/build.properties @@ -1 +1 @@ -sbt.version=0.13.15 +sbt.version=0.13.18 diff --git a/src/main/resources/config/bulk/default.conf b/src/main/resources/config/bulk/default.conf index c2ebd5f8..7f65caa0 100644 --- a/src/main/resources/config/bulk/default.conf +++ b/src/main/resources/config/bulk/default.conf @@ -1,5 +1,8 @@ root { - + drugs { + level: "cip13" + families: [] + } } diff --git a/src/main/resources/config/bulk/paths/cmap.conf b/src/main/resources/config/bulk/paths/cmap.conf index 295b2c5f..4e84e3d9 100644 --- a/src/main/resources/config/bulk/paths/cmap.conf +++ b/src/main/resources/config/bulk/paths/cmap.conf @@ -1,10 +1,10 @@ env_name = "cmap" input = { - dcir = "/shared/Observapur/staging/Flattening/flat_table/DCIR" - mco_ce = "/shared/Observapur/staging/Flattening/flat_table/MCO_CE" - mco = 
"/shared/Observapur/staging/Flattening/flat_table/MCO" - ir_ben = "/shared/Observapur/staging/Flattening/single_table/IR_BEN_R" + dcir = "/user/ds/CNAM447/flattening/flat_table/DCIR" + mco_ce = "/user/ds/CNAM447bis/flattening/flat_table/MCO_CE" + mco = "/user/ds/CNAM447bis/flattening/flat_table/MCO" + ir_ben = "/user/ds/CNAM447/flattening/single_table/IR_BEN_R" ir_imb = "/shared/Observapur/staging/Flattening/single_table/IR_IMB_R" ir_pha = "/shared/Observapur/staging/Flattening/single_table/IR_PHA_R_MOL" } diff --git a/src/main/resources/config/fall/default.conf b/src/main/resources/config/fall/default.conf index 1cc02a53..d3a30116 100644 --- a/src/main/resources/config/fall/default.conf +++ b/src/main/resources/config/fall/default.conf @@ -11,8 +11,9 @@ root { to_exposure_strategy = "purchase_count_based" } } - interaction { + interactions { level: 2 + minimum_duration: 30 days } drugs { level: "Therapeutic" @@ -28,7 +29,7 @@ root { fall_frame: 0 months // fractures are grouped if they happen in the same site within the period fallFrame, (default value 0 means no group) } run_parameters { - outcome: ["Acts", "Diagnoses", "Outcomes"] // pipeline of calculation of outcome, possible values : Acts, Diagnoses, and Outcomes + outcome: ["Acts", "Diagnoses", "HospitalDeaths", "Outcomes"] // pipeline of calculation of outcome, possible values : Acts, Diagnoses, and Outcomes exposure: ["Patients", "StartGapPatients", "DrugPurchases", "Exposures"] // pipeline of the calculation of exposure, possible values : Patients, StartGapPatients, DrugPurchases, Exposures } } diff --git a/src/main/resources/config/fall/template.conf b/src/main/resources/config/fall/template.conf index 2b69453c..463924a6 100644 --- a/src/main/resources/config/fall/template.conf +++ b/src/main/resources/config/fall/template.conf @@ -16,8 +16,10 @@ # exposures.end_threshold_gc: 90 days // If periodStrategy="limited", represents the period without purchases for an exposure to be considered "finished". 
# exposures.end_threshold_ngc: 30 days // If periodStrategy="limited", represents the period without purchases for an exposure to be considered "finished". +# exposures.end_delay: 30 days // Number of periods that we add to the exposure end to delay it (lag). +# exposures.to_exposure_strategy: purchase_count_based // possible values "purchase_count_based" or "lastest_purchase_based" # interactions.level: 3 // Integer representing the maximum number of values of Interaction. Please be careful as this does not scale well beyond 5 when the data contains a patient with a very high number of exposures +# interactions.minimum_duration: 30 days // If Interaction duration is less than this value, it is not considered. Proxy for medication change. # drugs.level: "Therapeutic" // Options are Therapeutic, Pharmacological, MoleculeCombination # drugs.families: ["Antihypertenseurs", "Antidepresseurs", "Neuroleptiques", "Hypnotiques"] diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/ConfigLoader.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/ConfigLoader.scala index e80fd271..da78e94f 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/ConfigLoader.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/ConfigLoader.scala @@ -11,7 +11,7 @@ import me.danielpes.spark.datetime.implicits._ import pureconfig._ import pureconfig.configurable.{localDateConfigConvert, localDateTimeConfigConvert} import pureconfig.generic.{CoproductHint, EnumCoproductHint, FieldCoproductHint, ProductHint} -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.DrugClassificationLevel +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.DrugClassificationLevel import fr.polytechnique.cmap.cnam.etl.transformers.exposures.ExposurePeriodAdder trait ConfigLoader { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/study/StudyConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/study/StudyConfig.scala index 
b92f496b..4df4c563 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/study/StudyConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/config/study/StudyConfig.scala @@ -12,12 +12,13 @@ object StudyConfig { dcir: Option[String] = None, mco: Option[String] = None, mcoCe: Option[String] = None, - ssr: Option[List[String]] = None, + ssr: Option[String] = None, ssrCe: Option[String] = None, had: Option[String] = None, irBen: Option[String] = None, irImb: Option[String] = None, irPha: Option[String] = None, + irNat: Option[String] = None, dosages: Option[String] = None) case class OutputPaths( diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/datatypes/Period.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/datatypes/Period.scala index 29272f60..cb19c7f2 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/datatypes/Period.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/datatypes/Period.scala @@ -3,7 +3,6 @@ package fr.polytechnique.cmap.cnam.etl.datatypes import java.sql.Timestamp -import fr.polytechnique.cmap.cnam.etl.transformers.interaction._ import fr.polytechnique.cmap.cnam.util.functions._ case class Period(start: Timestamp, end: Timestamp) extends Subtractable[Period] with Addable[Period]{ diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Classification.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Classification.scala index 9f2b0061..859ad0b0 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Classification.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Classification.scala @@ -3,28 +3,12 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row trait Classification extends AnyEvent with EventBuilder { val category: EventCategory[Classification] - def fromRow( - r: Row, - patientIDCol: String = "patientID", - nameCol: String = "name", - groupIDCol: String = "groupID", - dateCol: String = 
"eventDate") - : Event[Classification] = { - apply( - r.getAs[String](patientIDCol), - r.getAs[String](groupIDCol), - r.getAs[String](nameCol), - r.getAs[Timestamp](dateCol) - ) - } - def apply( patientID: String, groupID: String, diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Diagnosis.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Diagnosis.scala index ba063754..c55a512e 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Diagnosis.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Diagnosis.scala @@ -3,57 +3,27 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row trait Diagnosis extends AnyEvent with EventBuilder { val category: EventCategory[Diagnosis] - def fromRow(r: Row, patientIDCol: String, codeCol: String, dateCol: String): Event[Diagnosis] = { - apply(r.getAs[String](patientIDCol), r.getAs[String](codeCol), r.getAs[Timestamp](dateCol)) - } - def apply(patientID: String, code: String, date: Timestamp): Event[Diagnosis] = { Event(patientID, category, groupID = "NA", code, 0.0, date, None) } - def fromRow( - r: Row, - patientIDCol: String = "patientID", - groupIDCol: String = "groupID", - codeCol: String = "code", - dateCol: String = "eventDate"): Event[Diagnosis] = { - apply( - r.getAs[String](patientIDCol), - r.getAs[String](groupIDCol), - r.getAs[String](codeCol), - r.getAs[Timestamp](dateCol) - ) + def apply(patientID: String, code: String, date: Timestamp, endDate: Option[Timestamp]): Event[Diagnosis] = { + Event(patientID, category, groupID = "NA", code, 0.0, date, endDate) } def apply(patientID: String, groupID: String, code: String, date: Timestamp): Event[Diagnosis] = { Event(patientID, category, groupID, code, 0.0, date, None) } - def fromRow( - r: Row, - patientIDCol: String, - groupIDCol: String, - codeCol: String, - weightCol: String, - dateCol: String): Event[Diagnosis] = { - apply( - r.getAs[String](patientIDCol), - 
r.getAs[String](groupIDCol), - r.getAs[String](codeCol), - r.getAs[Double](weightCol), - r.getAs[Timestamp](dateCol) - ) - } - def apply(patientID: String, groupID: String, code: String, weight: Double, date: Timestamp): Event[Diagnosis] = { Event(patientID, category, groupID, code, weight, date, None) } + } object McoMainDiagnosis extends Diagnosis { @@ -92,6 +62,6 @@ object SsrTakingOverPurpose extends Diagnosis { val category: EventCategory[Diagnosis] = "ssr_taking_over_purpose" } -object ImbDiagnosis extends Diagnosis { - override val category: EventCategory[Diagnosis] = "imb_diagnosis" +object ImbCcamDiagnosis extends Diagnosis { + override val category: EventCategory[Diagnosis] = "imb_ccam_diagnosis" } \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Drug.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Drug.scala index fad54865..b60279f3 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Drug.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Drug.scala @@ -3,7 +3,6 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row object Drug extends Drug @@ -11,21 +10,6 @@ trait Drug extends Dispensation with EventBuilder { override val category: EventCategory[Drug] = "drug" - def fromRow( - r: Row, - patientIDCol: String = "patientID", - nameCol: String = "name", - dosageCol: String = "dosage", - dateCol: String = "eventDate"): Event[Drug] = { - - Drug( - r.getAs[String](patientIDCol), - r.getAs[String](nameCol), - r.getAs[Double](dosageCol), - r.getAs[Timestamp](dateCol) - ) - } - - def apply(patientID: String, name: String, dosage: Double, date: Timestamp): Event[Drug] = - Event(patientID, category, groupID = "NA", name, dosage, date, None) + def apply(patientID: String, name: String, dosage: Double, groupID: String, date: Timestamp): Event[Drug] = + Event(patientID, category, groupID, name, dosage, date, None) } \ No newline at 
end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/DrugPrescription.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/DrugPrescription.scala new file mode 100644 index 00000000..8a6f33df --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/DrugPrescription.scala @@ -0,0 +1,17 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.events + +import java.sql.Timestamp + +/** + * [[Event]] that combines [[Drug]]s to form a Prescription. + */ +trait DrugPrescription extends Dispensation with EventBuilder { + override val category: EventCategory[DrugPrescription] = "drug_prescription" + + def apply(patientID: String, name: String, dosage: Double, groupID: String, date: Timestamp): Event[DrugPrescription] = + Event(patientID, category, groupID, name, dosage, date, None) +} + +object DrugPrescription extends DrugPrescription \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Exposure.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Exposure.scala index 5aaa4e45..cbdd9f52 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Exposure.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Exposure.scala @@ -3,7 +3,6 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row object Exposure extends Exposure @@ -11,23 +10,6 @@ trait Exposure extends AnyEvent with EventBuilder { val category: EventCategory[Exposure] = "exposure" - def fromRow( - r: Row, - patientIDCol: String = "patientID", - nameCol: String = "name", - weightCol: String = "weight", - startCol: String = "start", - endCol: String = "end"): Event[Exposure] = { - - Exposure( - r.getAs[String](patientIDCol), - r.getAs[String](nameCol), - r.getAs[Double](weightCol), - r.getAs[Timestamp](startCol), - r.getAs[Timestamp](endCol) - ) - } - def apply( patientID: String, molecule: String, weight: Double, start: Timestamp, 
end: Timestamp ): Event[Exposure] = Event(patientID, category, groupID = "NA", molecule, weight, start, Some(end)) diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUp.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUp.scala index f30ac57b..27b74a49 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUp.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUp.scala @@ -3,33 +3,23 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row - +/** Factory for FollowUp instances. */ object FollowUp extends FollowUp - +/** This trait stores the methods required to create an Event object of type FollowUp. */ trait FollowUp extends AnyEvent with EventBuilder { val category: EventCategory[FollowUp] = "follow_up" - def fromRow( - r: Row, - patientIDCol: String = "patientID", - endReason: String = "endReason", - startCol: String = "start", - endCol: String = "end"): Event[FollowUp] = { - - FollowUp( - r.getAs[String](patientIDCol), - r.getAs[String](endReason), - r.getAs[Timestamp](startCol), - r.getAs[Timestamp](endCol) - ) - } - - + /** Creates an Event object of type FollowUp using a map function to map a dataset. + * + * @param patientID The value patientID from dataset. + * @param endReason The value endReason from dataset. + * @param start The value start from dataset. + * @param end The value end from dataset. + * @return Event[FollowUp]. 
+ */ def apply(patientID: String, endReason: String, start: Timestamp, end: Timestamp): Event[FollowUp] = Event(patientID, category, groupID = "NA", endReason, weight = 0D, start, Some(end)) -} - +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStay.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStay.scala index 0af3ef0c..84c08182 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStay.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStay.scala @@ -3,28 +3,22 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row -import fr.polytechnique.cmap.cnam.etl.events.Event.Columns._ trait HospitalStay extends AnyEvent with EventBuilder { override val category: EventCategory[HospitalStay] = "hospital_stay" - def fromRow( - r: Row, - patientIDCol: String = PatientID, - hospitalIDCol: String = Value, - startCol: String = Start, - endCol: String = End): Event[HospitalStay] = - apply( - r.getAs[String](patientIDCol), - r.getAs[String](hospitalIDCol), - r.getAs[Timestamp](startCol), - r.getAs[Timestamp](endCol) - ) - def apply(patientID: String, hospitalID: String, start: Timestamp, end: Timestamp): Event[HospitalStay] = apply(patientID, hospitalID, hospitalID, 0D, start, Some(end)) + + def apply( + patientID: String, + hospitalID: String, + weight: Double, + start: Timestamp, + end: Timestamp): Event[HospitalStay] = + apply(patientID, hospitalID, hospitalID, weight, start, Some(end)) + } object HospitalStay extends HospitalStay @@ -33,16 +27,20 @@ object McoHospitalStay extends HospitalStay { override val category: EventCategory[HospitalStay] = "mco_hospital_stay" } +object McoceEmergency extends HospitalStay { + override val category: EventCategory[HospitalStay] = "mco_ce_emergency" +} + /** Hospital Stay in the SSR PMSI are one type of hospital stays, see : - * 
https://documentation-snds.health-data-hub.fr/glossaire/ssr.html - */ + * https://documentation-snds.health-data-hub.fr/glossaire/ssr.html + */ object SsrHospitalStay extends HospitalStay { override val category: EventCategory[HospitalStay] = "ssr_hospital_stay" } /** HAD Hospital Stay in the HAD PMSI are one type of hospital stays, see : - * https://documentation-snds.health-data-hub.fr/glossaire/had.html - */ + * https://documentation-snds.health-data-hub.fr/glossaire/had.html + */ object HadHospitalStay extends HospitalStay { override val category: EventCategory[HospitalStay] = "had_hospital_stay" } \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalAct.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalAct.scala index 082c9dbc..535e0af5 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalAct.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalAct.scala @@ -3,28 +3,11 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row trait MedicalAct extends AnyEvent with EventBuilder { override val category: EventCategory[MedicalAct] - def fromRow( - r: Row, - patientIDCol: String = "patientID", - groupIDCol: String = "groupID", - codeCol: String = "code", - weightCol: String = "weight", - dateCol: String = "eventDate"): Event[MedicalAct] = { - this.apply( - r.getAs[String](patientIDCol), - r.getAs[String](groupIDCol), - r.getAs[String](codeCol), - r.getAs[Double](weightCol), - r.getAs[Timestamp](dateCol) - ) - } - def apply(patientID: String, groupID: String, code: String, weight: Double, date: Timestamp): Event[MedicalAct] = { Event(patientID, category, groupID, code, weight, date, None) } @@ -56,7 +39,7 @@ object McoCIM10Act extends MedicalAct { val category: EventCategory[MedicalAct] = "mco_cim10_act" } -object McoCEAct extends MedicalAct { +object McoCeCcamAct extends MedicalAct { val category: 
EventCategory[MedicalAct] = "mco_ce_act" } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReason.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReason.scala index c67ad0dd..7fd0b9af 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReason.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReason.scala @@ -1,28 +1,11 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row trait MedicalTakeOverReason extends AnyEvent with EventBuilder { override val category: EventCategory[MedicalTakeOverReason] - def fromRow( - r: Row, - patientIDCol: String = "patientID", - groupIDCol: String = "groupID", - codeCol: String = "code", - weightCol: String = "weight", - dateCol: String = "eventDate"): Event[MedicalTakeOverReason] = { - this.apply( - r.getAs[String](patientIDCol), - r.getAs[String](groupIDCol), - r.getAs[String](codeCol), - r.getAs[Double](weightCol), - r.getAs[Timestamp](dateCol) - ) - } - def apply(patientID: String, groupID: String, code: String, weight: Double, date: Timestamp): Event[MedicalTakeOverReason] = { Event(patientID, category, groupID, code, weight, date, None) } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Molecule.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Molecule.scala index 2de0e379..feb6116f 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Molecule.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Molecule.scala @@ -3,7 +3,6 @@ package fr.polytechnique.cmap.cnam.etl.events import java.sql.Timestamp -import org.apache.spark.sql.Row object Molecule extends Molecule @@ -11,21 +10,6 @@ trait Molecule extends Dispensation with EventBuilder { override val category: EventCategory[Molecule] = "molecule" - def fromRow( - r: Row, - patientIDCol: String = "patientID", - nameCol: String = "name", - dosageCol: String = 
"dosage", - dateCol: String = "eventDate"): Event[Molecule] = { - - Molecule( - r.getAs[String](patientIDCol), - r.getAs[String](nameCol), - r.getAs[Double](dosageCol), - r.getAs[Timestamp](dateCol) - ) - } - def apply(patientID: String, name: String, dosage: Double, date: Timestamp): Event[Molecule] = Event(patientID, category, groupID = "NA", name, dosage, date, None) } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/NgapAct.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/NgapAct.scala new file mode 100644 index 00000000..160aad0e --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/NgapAct.scala @@ -0,0 +1,64 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.events + +import java.sql.Timestamp + +/** The NGAP is one of the two different nomenclatures used in the SNDS to facture healthcare acts. + * + * It concerns mainly nurses, physiotherapist masseurs and medical auxiliaries and + * some acts of dental surgeons as well as clinical acts of doctors. It has to be distinguished from the CCAM, + * which groups together the technical acts performed by doctors (much more precise). 
+ * An up-to-date version can be found on the website of the CNAM:
+ * https://www.ameli.fr/ain/masseur-kinesitherapeute/exercice-liberal/facturation-remuneration/nomenclatures-ngap-et-lpp/nomenclatures-ngap-lpp
+ *
+ */
+trait NgapAct extends AnyEvent with EventBuilder {
+
+  override val category: EventCategory[NgapAct] = "ngap_act"
+
+  def apply(patientID: String, groupID: String, ngapCoefficient: String, weight: Double, date: Timestamp): Event[NgapAct] = {
+    Event(patientID, category, groupID, ngapCoefficient, weight, date, None)
+  }
+
+  def apply(patientID: String, groupID: String, ngapCoefficient: String, date: Timestamp): Event[NgapAct] = {
+    Event(patientID, category, groupID, ngapCoefficient, 0.0, date, None)
+  }
+}
+
+
+
+object DcirNgapAct extends NgapAct {
+  override val category: EventCategory[NgapAct] = "dcir_ngap_act"
+
+  object groupID {
+    val PrivateAmbulatory = "private_ambulatory"
+    val PublicAmbulatory = "public_ambulatory"
+    val PrivateHospital = "private_hospital"
+    val Liberal = "liberal"
+    val DcirNgapAct = "dcir_ngap_act"
+    val Unknown = "unknown_source"
+  }
+
+}
+
+/**
+ * The tables of hospital services (FBSTC) and procedures (FCSTC) are not filled in for every stay and are complementary.
+ * All the details are in the collaborative documentation of the SNDS here:
+ * https://documentation-snds.health-data-hub.fr/fiches/actes_consult_externes.html#reperage-des-ace-dans-la-table-des-prestations-dcir
+ */
+object McoCeFbstcNgapAct extends NgapAct {
+  override val category: EventCategory[NgapAct] = "mco_ce_fbstc_act"
+}
+
+object McoCeFcstcNgapAct extends NgapAct {
+  override val category: EventCategory[NgapAct] = "mco_ce_fcstc_act"
+}
+
+object SsrCeFbstcNgapAct extends NgapAct {
+  override val category: EventCategory[NgapAct] = "ssr_ce_fbstc_act"
+}
+
+object SsrCeFcstcNgapAct extends NgapAct {
+  override val category: EventCategory[NgapAct] = "ssr_ce_fcstc_act"
+}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/ObservationPeriod.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/ObservationPeriod.scala
index a3f8c16a..b81b6475 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/ObservationPeriod.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/ObservationPeriod.scala
@@ -1,28 +1,22 @@
 package fr.polytechnique.cmap.cnam.etl.events
 
 import java.sql.Timestamp
-import org.apache.spark.sql.Row
 
+/** Factory for ObservationPeriod instances. */
 object ObservationPeriod extends ObservationPeriod
 
+/** This trait stores the methods required to create an Event object of type ObservationPeriod. */
trait ObservationPeriod extends AnyEvent with EventBuilder {
 
   val category: EventCategory[ObservationPeriod] = "observation_period"
 
-  def fromRow(
-    r: Row,
-    patientIDCol: String = "patientID",
-    startCol: String = "start",
-    endCol: String = "end"): Event[ObservationPeriod] = {
-
-    ObservationPeriod(
-      r.getAs[String](patientIDCol),
-      r.getAs[Timestamp](startCol),
-      r.getAs[Timestamp](endCol)
-    )
-  }
-
-
+  /** Creates an Event object of type ObservationPeriod, typically used when mapping over a dataset.
+   *
+   * @param patientID The value patientID from the dataset.
+   * @param start The value start from the dataset.
+   * @param end The value end from the dataset.
+   * @return Event[ObservationPeriod].
+   */
   def apply(patientID: String, start: Timestamp, end: Timestamp): Event[ObservationPeriod] =
     Event(patientID, category, groupID = "NA", value = "NA", weight = 0D, start, Some(end))
-}
+}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Outcome.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Outcome.scala
index 3936817a..c49fdd44 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Outcome.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Outcome.scala
@@ -3,7 +3,6 @@
 package fr.polytechnique.cmap.cnam.etl.events
 
 import java.sql.Timestamp
-import org.apache.spark.sql.Row
 
 object Outcome extends Outcome
 
@@ -17,37 +16,9 @@ trait Outcome extends AnyEvent with EventBuilder {
 
   def apply(patientID: String, groupId: String, name: String, weight: Double, date: Timestamp): Event[Outcome] =
     Event(patientID, category, groupID = groupId, name, weight, date, None)
 
-  def fromRow(
-    r: Row,
-    patientIDCol: String = "patientID",
-    nameCol: String = "name",
-    dateCol: String = "eventDate"): Event[Outcome] = {
-
-    Outcome(
-      r.getAs[String](patientIDCol),
-      r.getAs[String](nameCol),
-      r.getAs[Timestamp](dateCol)
-    )
-  }
-
   def apply(patientID: String, name: String, date: Timestamp): Event[Outcome] =
     Event(patientID, category, groupID = "NA", name, 0.0, date, None)
 
-  def fromRow(
-    r: Row,
-    patientIDCol: String,
-    nameCol: String,
-    weightCol: String,
-    dateCol: String): Event[Outcome] = {
-
-    Outcome(
-      r.getAs[String](patientIDCol),
-      r.getAs[String](nameCol),
-      r.getAs[Double](weightCol),
-      r.getAs[Timestamp](dateCol)
-    )
-  }
-
   def apply(patientID: String, name: String, weight: Double, date: Timestamp): Event[Outcome] =
     Event(patientID, category, groupID = "NA", name, weight, date, None)
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpeciality.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpeciality.scala
index 44ed946d..763549f6 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpeciality.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpeciality.scala
@@ -3,25 +3,11 @@
 package fr.polytechnique.cmap.cnam.etl.events
 
 import java.sql.Timestamp
-import org.apache.spark.sql.Row
 
 trait PractitionerClaimSpeciality extends AnyEvent with EventBuilder {
 
   val category: EventCategory[PractitionerClaimSpeciality]
 
-  def fromRow(
-    r: Row,
-    patientIDCol: String = "patientID",
-    pfsIDCol: String = "groupID",
-    pfsSpeCol: String = "code",
-    dateCol: String = "eventDate"): Event[PractitionerClaimSpeciality] =
-    apply(
-      r.getAs[String](patientIDCol),
-      r.getAs[String](pfsIDCol),
-      r.getAs[String](pfsSpeCol),
-      r.getAs[Timestamp](dateCol)
-    )
-
   def apply(patientID: String, groupID: String, pfsSpe: String, date: Timestamp): Event[PractitionerClaimSpeciality] = {
     Event(patientID, category, groupID, pfsSpe, 0.0, date, None)
   }
@@ -33,4 +19,25 @@ object MedicalPractitionerClaim extends PractitionerClaimSpeciality {
 
 object NonMedicalPractitionerClaim extends PractitionerClaimSpeciality {
   override val category: EventCategory[PractitionerClaimSpeciality] = "non_medical_practitioner_claim"
-}
\ No newline at end of file
+}
+
+/**
+ * The tables of hospital services (FBSTC) and procedures (FCSTC) are not filled in for every stay and are complementary.
+ * All the details are in the collaborative documentation of the SNDS here:
+ * https://documentation-snds.health-data-hub.fr/fiches/actes_consult_externes.html#reperage-des-ace-dans-la-table-des-prestations-dcir
+ */
+object McoCeFbstcMedicalPractitionerClaim extends PractitionerClaimSpeciality {
+  override val category: EventCategory[PractitionerClaimSpeciality] = "mco_ce__fbstc_practitioner_claim"
+}
+
+object McoCeFcstcMedicalPractitionerClaim extends PractitionerClaimSpeciality {
+  override val category: EventCategory[PractitionerClaimSpeciality] = "mco_ce__fcstc_practitioner_claim"
+}
+
+object SsrCeFbstcMedicalPractitionerClaim extends PractitionerClaimSpeciality {
+  override val category: EventCategory[PractitionerClaimSpeciality] = "ssr_ce__fbstc_practitioner_claim"
+}
+
+object SsrCeFcstcMedicalPractitionerClaim extends PractitionerClaimSpeciality {
+  override val category: EventCategory[PractitionerClaimSpeciality] = "ssr_ce__fcstc_practitioner_claim"
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Trackloss.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Trackloss.scala
index 8690e596..34964c05 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Trackloss.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/events/Trackloss.scala
@@ -3,7 +3,6 @@
 package fr.polytechnique.cmap.cnam.etl.events
 
 import java.sql.Timestamp
-import org.apache.spark.sql.Row
 
 object Trackloss extends Trackloss
 
@@ -11,14 +10,6 @@ trait Trackloss extends AnyEvent with EventBuilder {
 
   val category: EventCategory[Trackloss] = "trackloss"
 
-  def fromRow(
-    r: Row,
-    patientIDCol: String = "patientID",
-    dateCol: String = "eventDate"): Event[Trackloss] = {
-
-    Trackloss(r.getAs[String](patientIDCol), r.getAs[Timestamp](dateCol))
-  }
-
   def apply(patientID: String, timestamp: Timestamp): Event[Trackloss] = {
     Event(patientID, category, groupID = "NA", "trackloss", 0.0, timestamp, None)
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ColumnNames.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ColumnNames.scala
index 37769629..a70e8233 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ColumnNames.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ColumnNames.scala
@@ -12,5 +12,4 @@ trait ColumnNames {
   implicit class RichColName(colName: ColName) {
     def toCol: Column = col(colName)
   }
-
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractor.scala
index e93814dd..d67a8105 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractor.scala
@@ -5,16 +5,25 @@
 package fr.polytechnique.cmap.cnam.etl.extractors
 
 import java.sql.Timestamp
 import org.apache.spark.sql.Row
 
+/**
+ * Trait to be implemented to extract the information of each Event field from a Row. It is usually implemented
+ * by simple Extractors.
+ *
+ * Provides default implementations for the non-mandatory fields groupId, value, weight and end.
+ */
 trait EventRowExtractor {
 
   self: ColumnNames =>
 
   def extractPatientId(r: Row): String
 
+  def extractStart(r: Row): Timestamp
+
   def extractGroupId(r: Row): String = "NA"
 
-  def extractWeight(r: Row): Double = 0.0
+
   def extractValue(r: Row): String = "NA"
 
-  def extractStart(r: Row): Timestamp
+  def extractWeight(r: Row): Double = 0.0
 
   def extractEnd(r: Row): Option[Timestamp] = None
+
+  def usedColumns: List[String] = List.empty
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/Extractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/Extractor.scala
index 016c59c2..385c6658 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/Extractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/Extractor.scala
@@ -2,34 +2,66 @@
 
 package fr.polytechnique.cmap.cnam.etl.extractors
 
-import scala.reflect.runtime.universe._
+import scala.reflect.runtime.universe.TypeTag
 import org.apache.spark.sql.{DataFrame, Dataset, Row}
 import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.ExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 
-trait Extractor[EventType <: AnyEvent] extends Serializable {
+trait Extractor[EventType <: AnyEvent, +Codes <: ExtractorCodes] extends Serializable {
 
-  def isInStudy(codes: Set[String])(row: Row): Boolean
+  def getCodes: Codes
 
+  /** Checks whether the Row is considered in the current Study.
+   *
+   * @param row The row itself.
+   * @return A boolean value.
+   */
+  def isInStudy(row: Row): Boolean
+
+
+  /** Checks if the passed Row has the information needed to build the Event.
+   *
+   * @param row The row itself.
+   * @return A boolean value.
+   */
   def isInExtractorScope(row: Row): Boolean
 
+  /** Builds the Event.
+   *
+   * @param row The row itself.
+   * @return An event object.
+   */
   def builder(row: Row): Seq[Event[EventType]]
 
+  /** Gets and prepares all the needed columns from the Sources.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @return A [[DataFrame]] with the needed columns.
+   */
   def getInput(sources: Sources): DataFrame
 
-  def extract(sources: Sources, codes: Set[String])(implicit ctag: TypeTag[EventType]): Dataset[Event[EventType]] = {
+  /** Extracts the Events from the Sources.
+   *
+   * This function is responsible for gluing together the other parts of the Extractor.
+   * It should be considered the only method callable from a Study perspective.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @param ctag An implicit parameter derived from the EventType type.
+   * @return A Dataset of Events of type EventType.
+   */
+  def extract(sources: Sources)(implicit ctag: TypeTag[EventType]): Dataset[Event[EventType]] = {
     val input: DataFrame = getInput(sources)
     import input.sqlContext.implicits._
 
     {
-      if (codes.isEmpty) {
+      if (getCodes.isEmpty) {
        input.filter(isInExtractorScope _)
      } else {
-        input.filter(isInExtractorScope _).filter(isInStudy(codes) _)
+        input.filter(isInExtractorScope _).filter(isInStudy _)
      }
-    }.flatMap(builder _).distinct()
+    }.flatMap(builder).distinct()
   }
-
-}
\ No newline at end of file
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/SimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/SimpleExtractor.scala
new file mode 100644
index 00000000..9db7a9ad
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/SimpleExtractor.scala
@@ -0,0 +1,76 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors
+
+import org.apache.spark.sql.Row
+import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event, EventBuilder}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+
+/**
+ * Default Extractor implementation for when the Extraction is simple.
+ *
+ * A simple Extractor is defined by the following characteristics:
+ * 1. The passed codes are of the type [[SimpleExtractorCodes]].
+ * 2. Every field of an [[Event]] is mapped simply by implementing [[EventRowExtractor]].
+ * 3. The `isInStudy` method of [[Extractor]] is implemented through an [[InStudyStrategy]] implementation.
+ *
+ * This trait has a self type of [[EventRowExtractor]], so every implementation must also be an [[EventRowExtractor]].
+ *
+ * This trait defines two abstract members:
+ * 1. eventBuilder: an [[EventBuilder]] factory that produces [[Event]]s of type [[EventType]].
+ * 2. columnName: the name of the column that provides the value field of the [[Event]].
+ *
+ * @tparam EventType Type of the [[Event]].
+ */
+trait SimpleExtractor[EventType <: AnyEvent] extends Extractor[EventType, SimpleExtractorCodes] {
+  self: EventRowExtractor =>
+
+  // Abstract methods of this trait
+  def columnName: String
+
+  def eventBuilder: EventBuilder
+
+  // used in the getInput method to select the columns.
+  def neededColumns: List[String] = columnName :: self.usedColumns
+
+  // The only method of EventRowExtractor that this trait implements
+  override def extractValue(row: Row): String = row.getAs[String](columnName)
+
+  // Implementation of the Extractor trait
+  def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(columnName))
+
+  def builder(row: Row): Seq[Event[EventType]] = {
+    lazy val patientId = extractPatientId(row)
+    lazy val groupId = extractGroupId(row)
+    lazy val value = extractValue(row)
+    lazy val eventDate = extractStart(row)
+    lazy val endDate = extractEnd(row)
+    lazy val weight = extractWeight(row)
+
+    Seq(eventBuilder[EventType](patientId, groupId, value, weight, eventDate, endDate))
+  }
+}
+
+/**
+ * Defines the `isInStudy` method of [[SimpleExtractor]].
+ * @tparam EventType Type of Event to be extracted in the self-typed [[SimpleExtractor]].
+ */
+sealed trait InStudyStrategy[EventType <: AnyEvent] {
+  self: SimpleExtractor[EventType] =>
+  override def isInStudy(row: Row): Boolean
+}
+
+trait AlwaysTrueStrategy[EventType <: AnyEvent] extends InStudyStrategy[EventType] {
+  self: SimpleExtractor[EventType] =>
+  def isInStudy(row: Row): Boolean = true
+}
+
+trait IsInStrategy[EventType <: AnyEvent] extends InStudyStrategy[EventType] {
+  self: SimpleExtractor[EventType] =>
+  def isInStudy(row: Row): Boolean = getCodes.contains(extractValue(row))
+}
+
+trait StartsWithStrategy[EventType <: AnyEvent] extends InStudyStrategy[EventType] {
+  self: SimpleExtractor[EventType] =>
+  def isInStudy(row: Row): Boolean = getCodes.exists(extractValue(row).startsWith)
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirMedicalActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirMedicalActExtractor.scala
deleted file mode 100644
index 8c642ee5..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirMedicalActExtractor.scala
+++ /dev/null
@@ -1,80 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
-
-import scala.util.Try
-import org.apache.spark.sql.{DataFrame, Row}
-import fr.polytechnique.cmap.cnam.etl.events.{BiologyDcirAct, DcirAct, EventBuilder, MedicalAct}
-import fr.polytechnique.cmap.cnam.etl.extractors.dcir.DcirExtractor
-import fr.polytechnique.cmap.cnam.etl.sources.Sources
-
-trait DcirActExtractor extends DcirExtractor[MedicalAct] {
-
-  private final val PrivateInstitutionCodes = Set(4D, 5D, 6D, 7D)
-
-  override def extractGroupId(r: Row): String = {
-    getGroupId(r) recover { case _: IllegalArgumentException => DcirAct.groupID.DcirAct }
-  }.get
-
-  /**
-   * Get the information of the origin of DCIR act that is being extracted. It returns a
-   * Failure[IllegalArgumentException] if the DCIR schema is old, a success if the DCIR schema contains an information.
-   *
-   * @param r the row of DCIR to be investigated.
-   * @return Try[String]
-   */
-  def getGroupId(r: Row): Try[String] = Try {
-
-    if (!r.isNullAt(r.fieldIndex(ColNames.Sector)) && getSector(r) == 1) {
-      DcirAct.groupID.PublicAmbulatory
-    }
-    else {
-      if (r.isNullAt(r.fieldIndex(ColNames.GHSCode))) {
-        DcirAct.groupID.Liberal
-      } else {
-        // Value is not at null, it is not liberal
-        lazy val ghs = getGHS(r)
-        lazy val institutionCode = getInstitutionCode(r)
-        // Check if it is a private ambulatory
-        if (ghs == 0 && PrivateInstitutionCodes.contains(institutionCode)) {
-          DcirAct.groupID.PrivateAmbulatory
-        }
-        else {
-          DcirAct.groupID.Unknown
-        }
-      }
-    }
-  }
-
-  def getGHS(r: Row): Double = r.getAs[Double](ColNames.GHSCode)
-
-  def getInstitutionCode(r: Row): Double = r.getAs[Double](ColNames.InstitutionCode)
-
-  def getSector(r: Row): Double = r.getAs[Double](ColNames.Sector)
-
-  override def extractWeight(r: Row): Double = 1.0
-
-}
-
-
-object DcirMedicalActExtractor extends DcirActExtractor {
-  override val columnName: String = ColNames.CamCode
-  override val eventBuilder: EventBuilder = DcirAct
-
-  override def getInput(sources: Sources): DataFrame = sources.dcir.get.select(
-    ColNames.PatientID, ColNames.CamCode, ColNames.Date,
-    ColNames.InstitutionCode, ColNames.GHSCode, ColNames.Sector
-  )
-}
-
-
-object DcirBiologyActExtractor extends DcirActExtractor {
-  override val columnName: String = ColNames.BioCode
-  override val eventBuilder: EventBuilder = BiologyDcirAct
-  override def code = (row: Row) => row.getAs[Double](columnName).toString
-
-  override def getInput(sources: Sources): DataFrame = sources.dcir.get.select(
-    ColNames.PatientID, ColNames.BioCode, ColNames.Date,
-    ColNames.InstitutionCode, ColNames.GHSCode, ColNames.Sector
-  )
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/HadActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/HadActExtractor.scala
deleted file mode 100644
index ed83b17b..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/HadActExtractor.scala
+++ /dev/null
@@ -1,12 +0,0 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
-
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.had.HadExtractor
-import org.apache.spark.sql.Row
-
-object HadCcamActExtractor extends HadExtractor[MedicalAct] {
-  final override val columnName: String = ColNames.CCAM
-  override val eventBuilder: EventBuilder = HadCCAMAct
-}
-
-
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoActExtractor.scala
deleted file mode 100644
index f9ca1098..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoActExtractor.scala
+++ /dev/null
@@ -1,16 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
-
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.mco.McoExtractor
-
-object McoCcamActExtractor extends McoExtractor[MedicalAct] {
-  final override val columnName: String = ColNames.CCAM
-  override val eventBuilder: EventBuilder = McoCCAMAct
-}
-
-object McoCimMedicalActExtractor extends McoExtractor[MedicalAct] {
-  final override val columnName: String = ColNames.DP
-  override val eventBuilder: EventBuilder = McoCIM10Act
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoCeActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoCeActExtractor.scala
deleted file mode 100644
index 0793353c..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoCeActExtractor.scala
+++ /dev/null
@@ -1,46 +0,0 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
-
-import java.sql.{Date, Timestamp}
-import org.apache.spark.sql.{DataFrame, Row, functions}
-import fr.polytechnique.cmap.cnam.etl.events.{Event, McoCEAct, MedicalAct}
-import fr.polytechnique.cmap.cnam.etl.extractors.Extractor
-import fr.polytechnique.cmap.cnam.etl.sources.Sources
-import fr.polytechnique.cmap.cnam.util.datetime.implicits._
-
-object McoCeActExtractor extends Extractor[MedicalAct] with McoCeSourceExtractor {
-
-  override def isInStudy(codes: Set[String])
-    (row: Row): Boolean = codes.exists(getCode(row).startsWith(_))
-
-  override def isInExtractorScope(row: Row): Boolean = !isNullAt(ColNames.CamCode)(row)
-
-  override def builder(row: Row): Seq[Event[MedicalAct]] = {
-    lazy val patientID = getPatientID(row)
-    lazy val date = getDate(row)
-    lazy val code = getCode(row)
-
-    Seq(McoCEAct(patientID, "ACE", code, date))
-  }
-
-  override def getInput(sources: Sources): DataFrame =
-    sources.mcoCe.get.select(ColNames.all.map(functions.col): _*)
-}
-
-
-trait McoCeSourceExtractor {
-
-  def getPatientID(row: Row): String = row.getAs[String](ColNames.PatientID)
-
-  def getDate(row: Row): Timestamp = row.getAs[Date](ColNames.Date).toTimestamp
-
-  def getCode(row: Row): String = row.getAs[String](ColNames.CamCode)
-
-  def isNullAt(colName: String)(row: Row): Boolean = row.isNullAt(row.fieldIndex(colName))
-
-  final object ColNames extends Serializable {
-    final lazy val PatientID = "NUM_ENQ"
-    final lazy val CamCode = "MCO_FMSTC__CCAM_COD"
-    final lazy val Date = "EXE_SOI_DTD"
-    final lazy val all = List(PatientID, CamCode, Date)
-  }
-
-}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/SsrActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/SsrActExtractor.scala
deleted file mode 100644
index 1db7434f..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/SsrActExtractor.scala
+++ /dev/null
@@ -1,27 +0,0 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
-
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.ssr.SsrExtractor
-import org.apache.spark.sql.Row
-
-object SsrCcamActExtractor extends SsrExtractor[MedicalAct] {
-  final override val columnName: String = ColNames.CCAM
-  override val eventBuilder: EventBuilder = SsrCCAMAct
-}
-
-/** Extract Csarr codes :
- *
- * The Specific Catalogue of Acts of Rehabilitation and Rehabilitation (CSARR) is intended to
- * describe and code the activity of the professionals concerned in follow-up care and
- * rehabilitation establishments (SSR). These acts are to be distinguished from CCAM acts which
- * are the sole responsibility of the doctor.
- *
- * This terminology is of the form `AAA+111`, eg. *GKQ+139 : Évaluation initiale du langage écrit*
- *
- * The complete terminology can be found here : https://drees.shinyapps.io/dico-snds/?variable=FP_PEC&search=csar&table=T_SSRaa_nnB
- * For more details see : https://www.atih.sante.fr/sites/default/files/public/content/3302/csarr_2018.pdf
- */
-object SsrCsarrActExtractor extends SsrExtractor[MedicalAct] {
-  final override val columnName: String = ColNames.CSARR
-  override val eventBuilder: EventBuilder = SsrCSARRAct
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/classifications/GhmExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/classifications/GhmExtractor.scala
deleted file mode 100644
index ac730ae8..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/classifications/GhmExtractor.scala
+++ /dev/null
@@ -1,11 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.classifications
-
-import fr.polytechnique.cmap.cnam.etl.events.{Classification, EventBuilder, GHMClassification}
-import fr.polytechnique.cmap.cnam.etl.extractors.mco.McoExtractor
-
-object GhmExtractor extends McoExtractor[Classification] {
-  final override val columnName: String = ColNames.GHM
-  override val eventBuilder: EventBuilder = GHMClassification
-}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/codes/ExtractorCodes.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/codes/ExtractorCodes.scala
new file mode 100644
index 00000000..93a065fb
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/codes/ExtractorCodes.scala
@@ -0,0 +1,7 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.codes
+
+trait ExtractorCodes extends Serializable {
+  def isEmpty: Boolean
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/codes/SimpleExtractorCodes.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/codes/SimpleExtractorCodes.scala
new file mode 100644
index 00000000..9a931909
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/codes/SimpleExtractorCodes.scala
@@ -0,0 +1,21 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.codes
+
+import scala.collection.immutable.HashSet
+
+class SimpleExtractorCodes(val codes: List[String]) extends ExtractorCodes {
+  val internalCodes: HashSet[String] = codes.to[HashSet]
+
+  override def isEmpty: Boolean = internalCodes.isEmpty
+
+  def exists(p: String => Boolean): Boolean = internalCodes.exists(p)
+
+  def contains(code: String): Boolean = internalCodes.contains(code)
+}
+
+object SimpleExtractorCodes {
+  def empty = new SimpleExtractorCodes(List.empty)
+
+  def apply(codes: List[String]): SimpleExtractorCodes = new SimpleExtractorCodes(codes)
+}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/dcir/DcirExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/dcir/DcirExtractor.scala
deleted file mode 100644
index 8ccf21d4..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/dcir/DcirExtractor.scala
+++ /dev/null
@@ -1,62 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.dcir
-
-import java.sql.Timestamp
-import org.apache.commons.codec.binary.Base64
-import org.apache.spark.sql.functions.col
-import org.apache.spark.sql.{DataFrame, Row}
-import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event, EventBuilder}
-import fr.polytechnique.cmap.cnam.etl.extractors.{EventRowExtractor, Extractor}
-import fr.polytechnique.cmap.cnam.etl.sources.Sources
-import fr.polytechnique.cmap.cnam.util.datetime.implicits._
-
-trait DcirExtractor[EventType <: AnyEvent] extends Extractor[EventType] with DcirSource with EventRowExtractor {
-
-  val columnName: String
-
-  val eventBuilder: EventBuilder
-
-  def getInput(sources: Sources): DataFrame = sources.dcir.get.select(ColNames.all.map(col): _*)
-
-  def isInStudy(codes: Set[String])
-    (row: Row): Boolean = codes.exists(code(row).startsWith(_))
-
-  def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(columnName))
-
-  def builder(row: Row): Seq[Event[EventType]] = {
-    lazy val patientId = extractPatientId(row)
-    lazy val groupId = extractGroupId(row)
-    lazy val eventDate = extractStart(row)
-    lazy val endDate = extractEnd(row)
-    lazy val weight = extractWeight(row)
-
-    Seq(eventBuilder[EventType](patientId, groupId, code(row), weight, eventDate, endDate))
-  }
-
-  def code = (row: Row) => row.getAs[String](columnName)
-
-  def extractPatientId(r: Row): String = {
-    r.getAs[String](ColNames.PatientID)
-  }
-
-  def extractStart(r: Row): Timestamp = r.getAs[java.util.Date](ColNames.Date).toTimestamp
-
-  def extractFluxDate(r: Row): Timestamp = r.getAs[java.util.Date](ColNames.DcirFluxDate).toTimestamp
-
-  override def extractGroupId(r: Row): String = {
-    Base64.encodeBase64(s"${r.getAs[String](ColNames.DateStart)}_${r.getAs[String](ColNames.DateEntry)}_${
-      r.getAs[String](
-        ColNames
-          .EmitterType
-      )
-    }_${r.getAs[String](ColNames.EmitterId)}_${r.getAs[String](ColNames.FlowNumber)}_${
-      r.getAs[String](
-        ColNames
-          .OrgId
-      )
-    }_${r.getAs[String](ColNames.OrderId)}".getBytes()).map(_.toChar).mkString
-
-
-  }
-}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/HadDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/HadDiagnosisExtractor.scala
deleted file mode 100644
index 17fa996a..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/HadDiagnosisExtractor.scala
+++ /dev/null
@@ -1,14 +0,0 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
-
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.had.HadExtractor
-
-object HadMainDiagnosisExtractor extends HadExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DP
-  override val eventBuilder: EventBuilder = HadMainDiagnosis
-}
-
-object HadAssociatedDiagnosisExtractor extends HadExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DA
-  override val eventBuilder: EventBuilder = HadAssociatedDiagnosis
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/ImbDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/ImbDiagnosisExtractor.scala
deleted file mode 100644
index a26452df..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/ImbDiagnosisExtractor.scala
+++ /dev/null
@@ -1,49 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
-
-import java.sql.{Date, Timestamp}
-import org.apache.spark.sql.{DataFrame, Row}
-import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event, ImbDiagnosis}
-import fr.polytechnique.cmap.cnam.etl.extractors.Extractor
-import fr.polytechnique.cmap.cnam.etl.sources.Sources
-import fr.polytechnique.cmap.cnam.util.datetime
-
-object ImbDiagnosisExtractor extends Extractor[Diagnosis] with ImbSource {
-
-  override def isInExtractorScope(row: Row): Boolean = {
-    lazy val idx = row.fieldIndex(ColNames.Code)
-    getEncoding(row) == "CIM10" || !row.isNullAt(idx)
-  }
-
-  override def isInStudy(codes: Set[String])
-    (row: Row): Boolean = codes.exists(getCode(row).startsWith(_))
-
-  override def builder(row: Row): Seq[Event[Diagnosis]] =
-    Seq(ImbDiagnosis(getPatientID(row), getCode(row), getEventDate(row)))
-
-  override def getInput(sources: Sources): DataFrame = sources.irImb.get
-}
-
-trait ImbSource extends Serializable {
-
-  lazy val getCode = (row: Row) => row.getAs[String](ColNames.Code)
-
-  def getEncoding(row: Row): String = row.getAs[String](ColNames.Encoding)
-
-  def getPatientID(row: Row): String = row.getAs[String](ColNames.PatientID)
-
-  def getEventDate(row: Row): Timestamp = {
-    import datetime.implicits._
-
-    row.getAs[Date](ColNames.Date).toTimestamp
-  }
-
-  final object ColNames extends Serializable {
-    final lazy val PatientID = "NUM_ENQ"
-    final lazy val Encoding = "MED_NCL_IDT"
-    final lazy val Code = "MED_MTF_COD"
-    final lazy val Date = "IMB_ALD_DTD"
-  }
-
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/McoDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/McoDiagnosisExtractor.scala
deleted file mode 100644
index 88cf61df..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/McoDiagnosisExtractor.scala
+++ /dev/null
@@ -1,25 +0,0 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
-
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.mco.McoExtractor
-
-class McoMainDiagnosisExtractor extends McoExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DP
-  override val eventBuilder: EventBuilder = McoMainDiagnosis
-}
-
-object McoMainDiagnosisExtractor extends McoMainDiagnosisExtractor
-
-class McoAssociatedDiagnosisExtractor extends McoExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DA
-  override val eventBuilder: EventBuilder = McoAssociatedDiagnosis
-}
-
-object McoAssociatedDiagnosisExtractor extends McoAssociatedDiagnosisExtractor
-
-class McoLinkedDiagnosisExtractor extends McoExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DR
-  override val eventBuilder: EventBuilder = McoLinkedDiagnosis
-}
-
-object McoLinkedDiagnosisExtractor extends McoLinkedDiagnosisExtractor
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/SsrDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/SsrDiagnosisExtractor.scala
deleted file mode 100644
index 8438e494..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/SsrDiagnosisExtractor.scala
+++ /dev/null
@@ -1,24 +0,0 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
-
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.ssr.SsrExtractor
-
-object SsrMainDiagnosisExtractor extends SsrExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DP
-  override val eventBuilder: EventBuilder = SsrMainDiagnosis
-}
-
-object SsrAssociatedDiagnosisExtractor extends SsrExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DA
-  override val eventBuilder: EventBuilder = SsrAssociatedDiagnosis
-}
-
-object SsrLinkedDiagnosisExtractor extends SsrExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.DR
-  override val eventBuilder: EventBuilder = SsrLinkedDiagnosis
-}
-
-object SsrTakingOverPurposeExtractor extends SsrExtractor[Diagnosis] {
-  final override val columnName: String = ColNames.FP_PEC
-  override val eventBuilder: EventBuilder = SsrTakingOverPurpose
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugConfig.scala
deleted file mode 100644
index 0db3c570..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugConfig.scala
+++ /dev/null
@@ -1,17 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.drugs
-
-import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.DrugClassificationLevel
-
-class DrugConfig(
-  val level: DrugClassificationLevel,
-  val families: List[DrugClassConfig]) extends ExtractorConfig with Serializable
-
-object DrugConfig {
-  def apply(level: DrugClassificationLevel, families: List[DrugClassConfig]): DrugConfig = new DrugConfig(
-    level, families
-  )
-}
\ No newline at end of file
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Opioids.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Opioids.scala
deleted file mode 100644
index 8e05e086..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Opioids.scala
+++ /dev/null
@@ -1,480 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families
-
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig}
-
-object Opioids extends DrugClassConfig {
-  override val name: String = "Opioids"
-  override val cip13Codes: Set[String] = Set(
-    "3400938747393",
-    "3400939023649",
-    "3400935770073",
-    "3400936114548",
-    "3400936114487",
-    "3400934828096",
-    "3400937479462",
-    "3400937489928",
-    "3400937484145",
-    "3400934992308",
-    "3400934990007",
-    "3400934951374",
-    "3400930378540",
-    "3400933715014",
-    "3400935641281",
-    "3400930411469",
-    "3400927319143",
-    "3400936853690",
-    "3400934314797",
-    "3400934314568",
-    "3400934314278",
-    "3400934808166",
-
"3400934802362", - "3400936102132", - "3400926845926", - "3400938229783", - "3400936853812", - "3400935793195", - "3400926980801", - "3400926797348", - "3400936319714", - "3400927671029", - "3400939104874", - "3400935877154", - "3400935106704", - "3400934538193", - "3400939221601", - "3400939221540", - "3400935620699", - "3400936907553", - "3400934991295", - "3400935620170", - "3400930411230", - "3400931959328", - "3400936206830", - "3400936248182", - "3400936247123", - "3400936970366", - "3400935067340", - "3400935349897", - "3400938127676", - "3400938227024", - "3400931959038", - "3400931958956", - "3400935486554", - "3400931959496", - "3400935156440", - "3400935155788", - "3400938231045", - "3400936969247", - "3400936968936", - "3400936672598", - "3400935438478", - "3400927321214", - "3400938228373", - "3400936247642", - "3400936206779", - "3400936102651", - "3400927607035", - "3400927562396", - "3400927597237", - "3400935888730", - "3400936289406", - "3400936289284", - "3400938399097", - "3400939826240", - "3400936969995", - "3400938222920", - "3400936812420", - "3400936810709", - "3400936809758", - "3400936748804", - "3400933220754", - "3400933304652", - "3400933319090", - "3400934976339", - "3400934976278", - "3400936041295", - "3400936041417", - "3400938212983", - "3400933316778", - "3400935486783", - "3400935404640", - "3400934499760", - "3400936289635", - "3400935567031", - "3400935438249", - "3400922096032", - "3400922096322", - "3400936141391", - "3400935748713", - "3400935748652", - "3400936787193", - "3400934654329", - "3400934654039", - "3400938552454", - "3400935350039", - "3400937356473", - "3400938675443", - "3400938675214", - "3400935349958", - "3400937356305", - "3400937700177", - "3400935856913", - "3400934654787", - "3400934654558", - "3400930303320", - "3400934007286", - "3400935194121", - "3400934882845", - "3400935877505", - "3400935193810", - "3400937051323", - "3400937015943", - "3400933180850", - "3400930002285", - "3400930002278", - 
"3400926690908", - "3400926963422", - "3400938459852", - "3400939846460", - "3400934890659", - "3400939392325", - "3400934890888", - "3400933911881", - "3400934802133", - "3400938042887", - "3400938042597", - "3400939903392", - "3400939641706", - "3400939711874", - "3400938042139", - "3400921856996", - "3400921856828", - "3400922198088", - "3400921857658", - "3400921879599", - "3400949251087", - "3400949250547", - "3400949914821", - "3400949217540", - "3400949217250", - "3400949217199", - "3400930332016", - "3400939202099", - "3400939641355", - "3400939640983", - "3400939711935", - "3400938510232", - "3400927560446", - "3400927658532", - "3400927657702", - "3400934410932", - "3400934399435", - "3400934399084", - "3400934398544", - "3400934291463", - "3400934291234", - "3400934387609", - "3400934387258", - "3400927656989", - "3400927748752", - "3400927760587", - "3400927759239", - "3400927757976", - "3400939220710", - "3400938509342", - "3400938509861", - "3400934890420", - "3400932869947", - "3400927659591", - "3400938458732", - "3400935438300", - "3400936587281", - "3400936985155", - "3400935695161", - "3400936911635", - "3400936911055", - "3400936910683", - "3400939200668", - "3400939104355", - "3400938460223", - "3400938533415", - "3400938509113", - "3400938124255", - "3400939024479", - "3400938398618", - "3400926961121", - "3400939104584", - "3400939104416", - "3400939105765", - "3400939105185", - "3400939104935", - "3400939104706", - "3400934053702", - "3400938508741", - "3400938508512", - "3400938652840", - "3400938652321", - "3400927756337", - "3400927753725", - "3400927751653", - "3400939221021", - "3400927755095", - "3400931164531", - "3400934238536", - "3400934238307", - "3400934021305", - "3400934641541", - "3400922301938", - "3400934238475", - "3400936910515", - "3400936102422", - "3400936894563", - "3400934866067", - "3400935322982", - "3400922096261", - "3400935349729", - "3400949363940", - "3400939190051", - "3400936907782", - "3400936906891", - 
"3400936906372", - "3400936906143", - "3400936905771", - "3400935807366", - "3400935806826", - "3400927890031", - "3400938651089", - "3400938650488", - "3400938225532", - "3400936794238", - "3400926942489", - "3400935857392", - "3400935154729", - "3400939200439", - "3400934387487", - "3400933684792", - "3400934748042", - "3400939104645", - "3400939221489", - "3400939221311", - "3400939221250", - "3400939221199", - "3400939220888", - "3400939213507", - "3400939213446", - "3400932952724", - "3400932870028", - "3400932869718", - "3400922303420", - "3400922303079", - "3400922302768", - "3400922302300", - "3400921856477", - "3400935703477", - "3400930777473", - "3400935018663", - "3400930777534", - "3400933323813", - "3400933323752", - "3400935130594", - "3400939186788", - "3400936910454", - "3400936910225", - "3400949217489", - "3400939187679", - "3400935998651", - "3400932869886", - "3400949914531", - "3400934890130", - "3400927560965", - "3400927561337", - "3400927888601", - "3400935806307", - "3400935349378", - "3400939185668", - "3400939726205", - "3400939725192", - "3400935619921", - "3400935844217", - "3400935843845", - "3400936203068", - "3400936242388", - "3400936242159", - "3400936241909", - "3400935422279", - "3400927656750", - "3400939314952", - "3400939712017", - "3400921857948", - "3400949251209", - "3400949378678", - "3400934827846", - "3400922095950", - "3400936809178", - "3400930068571", - "3400921857139", - "3400949812134", - "3400949666621", - "3400949666331", - "3400949915590", - "3400933323691", - "3400933323523", - "3400931164821", - "3400939391144", - "3400939390543", - "3400939342757", - "3400939844510", - "3400930068519", - "3400930068649", - "3400921857887", - "3400921857719", - "3400921857597", - "3400921856767", - "3400921856538", - "3400921857368", - "3400921857078", - "3400949914302", - "3400938504897", - "3400935429483", - "3400935421500", - "3400933480059", - "3400933479978", - "3400933799229", - "3400927889721", - "3400927889370", - 
"3400927889080", - "3400927561108", - "3400939827070", - "3400935595874", - "3400935615558", - "3400936651548", - "3400934760228", - "3400939755588", - "3400939755878", - "3400939476803", - "3400939417899", - "3400939417370", - "3400932551897", - "3400930075722", - "3400933803681", - "3400939478173", - "3400939476223", - "3400939479415", - "3400935714244", - "3400939118833", - "3400926939359", - "3400927656699", - "3400935509369", - "3400936587632", - "3400930051047", - "3400933803452", - "3400939477404", - "3400939825649", - "3400930057834", - "3400932966332", - "3400935185884", - "3400935107534", - "3400935660091", - "3400935108074", - "3400935107183", - "3400935671677", - "3400930068960", - "3400930068892", - "3400927657641", - "3400926845117", - "3400927656811", - "3400927658471", - "3400927655630", - "3400927655920", - "3400927659423", - "3400930068823", - "3400935982452", - "3400927656002", - "3400930076033", - "3400930075937", - "3400930075623", - "3400937847698", - "3400934827617", - "3400935703767", - "3400930587508", - "3400935065100", - "3400935703248", - "3400921879711", - "3400936969537", - "3400936595965", - "3400949214587", - "3400936819344", - "3400935531650", - "3400931308492", - "3400939478814", - "3400939213736", - "3400921857429", - "3400936289055", - "3400935235893", - "3400930075388", - "3400934170195", - "3400933305314", - "3400932461219", - "3400939391892", - "3400939843629", - "3400927660313", - "3400927560675", - "3400930587737", - "3400935299130", - "3400938324747", - "3400933724467", - "3400938149500", - "3400936690349", - "3400935157041", - "3400935236036", - "3400936968646", - "3400937374514", - "3400938509571", - "3400927513541", - "3400935414007", - "3400926721466", - "3400926838072", - "3400935583239", - "3400935404701", - "3400935565839", - "3400930014103", - "3400930045350", - "3400939699257", - "3400938458442", - "3400935420909", - "3400927655869", - "3400935713063", - "3400933275815", - "3400937015653", - "3400949217311", - 
"3400939200729", - "3400934300660", - "3400936212053", - "3400938125955", - "3400935570970", - "3400936289925", - "3400939845340", - "3400936906204", - "3400936105034", - "3400939723471", - "3400935666826", - "3400935843494", - "3400935768872", - "3400938508970", - "3400938651720", - "3400936853751", - "3400939104294", - "3400934238765", - "3400949363599", - "3400939220949", - "3400933765910", - "3400927660252", - "3400935248442", - "3400926943141", - "3400935694911", - "3400939185897" - ) - override val pharmacologicalClasses: List[PharmacologicalClassConfig] = List.empty -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/ProtonPumpInhibitors.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/ProtonPumpInhibitors.scala deleted file mode 100644 index 1ff4a3db..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/ProtonPumpInhibitors.scala +++ /dev/null @@ -1,968 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families - -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} - -object ProtonPumpInhibitors extends DrugClassConfig{ - override val name: String = "ProtonPumpInhibitors" - override val cip13Codes: Set[String] = Set( - "3400949001279", - "3400938104905", - "3400938268706", - "3400938278170", - "3400938277920", - "3400938277579", - "3400938277340", - "3400949766208", - "3400949015856", - "3400949015795", - "3400949337118", - "3400949336456", - "3400949001637", - "3400949011834", - "3400938103205", - "3400949011544", - "3400936291706", - "3400936291584", - "3400936290754", - "3400936287792", - "3400936288515", - "3400936287914", - "3400934868139", - "3400941787461", - "3400949337286", - "3400927818172", - "3400927823206", - "3400922392486", - "3400939074566", - "3400936747333", - "3400936884397", 
- "3400949014217", - "3400938104035", - "3400938144307", - "3400949001064", - "3400949498765", - "3400941889578", - "3400949000500", - "3400949001057", - "3400941788642", - "3400941783388", - "3400949001033", - "3400949001026", - "3400949001019", - "3400938954401", - "3400937530880", - "3400936483163", - "3400936676732", - "3400936674431", - "3400936673779", - "3400936673250", - "3400921838329", - "3400937827997", - "3400938782462", - "3400936747272", - "3400921854817", - "3400938112887", - "3400949946419", - "3400949968879", - "3400949941155", - "3400938106107", - "3400941859014", - "3400927481741", - "3400939307688", - "3400921850215", - "3400949117154", - "3400949117093", - "3400949116263", - "3400941902109", - "3400941889349", - "3400941888168", - "3400936887411", - "3400936887350", - "3400936884168", - "3400936883918", - "3400936883857", - "3400936805446", - "3400936805217", - "3400936676961", - "3400921992304", - "3400949004799", - "3400949003440", - "3400941787522", - "3400941607981", - "3400941784569", - "3400921994773", - "3400949008643", - "3400936804784", - "3400941904868", - "3400938119343", - "3400930003060", - "3400938142693", - "3400938140163", - "3400938112139", - "3400938112078", - "3400938109986", - "3400938109696", - "3400936483514", - "3400941611834", - "3400941610073", - "3400941609244", - "3400941608292", - "3400949008124", - "3400949007523", - "3400938106336", - "3400949015917", - "3400936747562", - "3400935674289", - "3400936281820", - "3400935842374", - "3400935974419", - "3400936284432", - "3400936886759", - "3400949001330", - "3400949000517", - "3400939213156", - "3400939524689", - "3400936883796", - "3400936282940", - "3400939072784", - "3400922389875", - "3400938118452", - "3400936804036", - "3400936483682", - "3400938117103", - "3400949761005", - "3400949946358", - "3400949001354", - "3400949001347", - "3400941781209", - "3400941781087", - "3400941780837", - "3400938117790", - "3400922489223", - "3400927607493", - "3400926663872", - 
"3400949699858", - "3400949970889", - "3400936288744", - "3400934081217", - "3400934130106", - "3400934052750", - "3400934455186", - "3400934455247", - "3400926957278", - "3400927568022", - "3400927567711", - "3400927567650", - "3400922296005", - "3400934415555", - "3400934415494", - "3400930011638", - "3400921927948", - "3400921928198", - "3400921848083", - "3400921847543", - "3400926955786", - "3400926954086", - "3400935585011", - "3400930060735", - "3400930060704", - "3400922295923", - "3400922296463", - "3400922197609", - "3400921836776", - "3400921611212", - "3400921633672", - "3400933839352", - "3400934869259", - "3400934869020", - "3400933839291", - "3400933839871", - "3400930011515", - "3400930002926", - "3400930011508", - "3400930060193", - "3400926823405", - "3400926823283", - "3400934963841", - "3400934475757", - "3400934868429", - "3400922197548", - "3400922197319", - "3400922403526", - "3400922095431", - "3400930060179", - "3400930060698", - "3400930060247", - "3400921636345", - "3400921634914", - "3400935674401", - "3400936862746", - "3400949762644", - "3400949719181", - "3400926816889", - "3400926815530", - "3400926957568", - "3400927359583", - "3400927359415", - "3400922037820", - "3400922037530", - "3400922037479", - "3400939030272", - "3400933341213", - "3400933341152", - "3400936323094", - "3400935533951", - "3400930066409", - "3400930066386", - "3400930066379", - "3400935585479", - "3400935584939", - "3400935584588", - "3400935533432", - "3400935533203", - "3400935585301", - "3400922037240", - "3400922036069", - "3400922034287", - "3400922033808", - "3400935584298", - "3400934706639", - "3400934081095", - "3400921849844", - "3400921852516", - "3400921852165", - "3400935584700", - "3400935584410", - "3400936282360", - "3400937991339", - "3400930070604", - "3400930070598", - "3400930070574", - "3400934456077", - "3400936871731", - "3400936871670", - "3400936871441", - "3400934410413", - "3400934303562", - "3400934081446", - "3400921928259", - 
"3400921995954", - "3400921994315", - "3400921998627", - "3400922000459", - "3400921993653", - "3400921992823", - "3400949766376", - "3400936668119", - "3400936282889", - "3400936805965", - "3400922463438", - "3400926955908", - "3400939389653", - "3400938778502", - "3400938776379", - "3400949721542", - "3400949721481", - "3400938991598", - "3400938277869", - "3400927932694", - "3400936288683", - "3400949766666", - "3400949718061", - "3400949755370", - "3400949494514", - "3400949759804", - "3400949495573", - "3400927823374", - "3400949718412", - "3400949718351", - "3400949702091", - "3400949701612", - "3400949494743", - "3400949757152", - "3400949758791", - "3400949758562", - "3400949757381", - "3400949487189", - "3400949970421", - "3400949946297", - "3400949946129", - "3400927931925", - "3400927931116", - "3400927930683", - "3400927871399", - "3400927871221", - "3400927929854", - "3400935881694", - "3400939307749", - "3400939307510", - "3400939255439", - "3400939541778", - "3400939271286", - "3400939322018", - "3400949754021", - "3400938703979", - "3400938703740", - "3400939075228", - "3400939030623", - "3400939073095", - "3400939030562", - "3400938117561", - "3400938117271", - "3400938119633", - "3400938119572", - "3400938112948", - "3400937827188", - "3400938118681", - "3400939321646", - "3400936287105", - "3400936287044", - "3400934549021", - "3400934706868", - "3400934705519", - "3400934097935", - "3400939030333", - "3400934455766", - "3400938991710", - "3400934705229", - "3400941784279", - "3400939072555", - "3400949003099", - "3400941787751", - "3400934130045", - "3400927931284", - "3400927929915", - "3400927366796", - "3400926665654", - "3400927418358", - "3400939239699", - "3400939321936", - "3400937991278", - "3400949719532", - "3400949718870", - "3400930011669", - "3400927607264", - "3400935533371", - "3400949968411", - "3400936284661", - "3400938193503", - "3400938108866", - "3400934606533", - "3400938104844", - "3400922402406", - "3400949753949", - 
"3400949718122", - "3400949498536", - "3400936287273", - "3400949755202", - "3400949719471", - "3400926953775", - "3400949946068", - "3400949945986", - "3400949946877", - "3400949946709", - "3400949946648", - "3400949941445", - "3400922462776", - "3400922465210", - "3400926668327", - "3400926667894", - "3400926661281", - "3400926663414", - "3400939541198", - "3400939320816", - "3400939320755", - "3400939213385", - "3400939213217", - "3400939213095", - "3400939239989", - "3400949767786", - "3400939030104", - "3400939072326", - "3400938781052", - "3400938780802", - "3400938778380", - "3400938785364", - "3400949939893", - "3400938118513", - "3400938118162", - "3400938194333", - "3400937974998", - "3400938194104", - "3400938193732", - "3400939541549", - "3400938784015", - "3400938186529", - "3400922403816", - "3400926951306", - "3400935514790", - "3400936290525", - "3400936282421", - "3400936668287", - "3400934304163", - "3400927607615", - "3400934415784", - "3400934475696", - "3400922491233", - "3400922490281", - "3400927417818", - "3400939322186", - "3400937530941", - "3400941902048", - "3400949026272", - "3400949014156", - "3400926661632", - "3400922489452", - "3400938407525", - "3400921634792", - "3400941783159", - "3400930060223", - "3400949000838", - "3400930011539", - "3400934097126", - "3400922296173", - "3400949769278", - "3400949769049", - "3400949767908", - "3400949498246", - "3400949498017", - "3400949762415", - "3400949000494", - "3400927357572", - "3400922491691", - "3400922491462", - "3400922490403", - "3400927824326", - "3400927823435", - "3400949721771", - "3400939257679", - "3400939257440", - "3400939541488", - "3400939540948", - "3400939656953", - "3400939656724", - "3400949761463", - "3400938778151", - "3400938776140", - "3400938993660", - "3400935941879", - "3400937828369", - "3400937828598", - "3400927357343", - "3400927352898", - "3400926766870", - "3400926766351", - "3400927607325", - "3400927481512", - "3400939389882", - "3400937828079", - 
"3400939075167", - "3400939080710", - "3400941785511", - "3400922036298", - "3400927357404", - "3400936482852", - "3400927822544", - "3400926766580", - "3400922033976", - "3400926822163", - "3400926815998", - "3400927818004", - "3400927817922", - "3400927822483", - "3400936482913", - "3400949721023", - "3400949720880", - "3400949720651", - "3400949719242", - "3400949721313", - "3400949718702", - "3400927417986", - "3400938785074", - "3400949761173", - "3400949717811", - "3400949717699", - "3400949765256", - "3400949765027", - "3400927821714", - "3400927821653", - "3400927800481", - "3400927568190", - "3400949718580", - "3400939388472", - "3400939388243", - "3400936323216", - "3400934476068", - "3400938408416", - "3400939816074", - "3400949763535", - "3400949760053", - "3400937827829", - "3400937827768", - "3400938116731", - "3400938116502", - "3400938113600", - "3400927481451", - "3400927481390", - "3400927418129", - "3400927418068", - "3400927417757", - "3400927933295", - "3400927933127", - "3400939815763", - "3400938113259", - "3400938113020", - "3400939079998", - "3400921609370", - "3400938175172", - "3400949339648", - "3400936324336", - "3400921636574", - "3400921628999", - "3400949341948", - "3400941657320", - "3400941896613", - "3400936806627", - "3400921607949", - "3400941895371", - "3400921793390", - "3400927932175", - "3400936290235", - "3400921851335", - "3400936484115", - "3400927930973", - "3400936673830", - "3400921792560", - "3400936281240", - "3400934615528", - "3400936325975", - "3400949766727", - "3400949766437", - "3400936672949", - "3400949011773", - "3400949022311", - "3400949022250", - "3400949341887", - "3400949341719", - "3400949341368", - "3400949341078", - "3400941904400", - "3400949016686", - "3400949016457", - "3400949623228", - "3400941659041", - "3400941658969", - "3400941895661", - "3400941894312", - "3400941893131", - "3400941892998", - "3400941892820", - "3400949338986", - "3400949338818", - "3400941658730", - "3400941657498", - 
"3400949943456", - "3400949021598", - "3400949010073", - "3400921636284", - "3400949943395", - "3400930000755", - "3400927873690", - "3400949026562", - "3400949026333", - "3400949015047", - "3400949014965", - "3400949014675", - "3400921627701", - "3400921627589", - "3400949011605", - "3400949948079", - "3400921849035", - "3400921609660", - "3400927368110", - "3400934455827", - "3400935583987", - "3400927930805", - "3400939257389", - "3400939307220", - "3400938174342", - "3400921611502", - "3400949948130", - "3400926802462", - "3400936281189", - "3400926814410", - "3400935584069", - "3400936480551", - "3400949948369", - "3400949339709", - "3400939146348", - "3400949022021", - "3400939154503", - "3400939151250", - "3400936324626", - "3400922037301", - "3400921790559", - "3400939655482", - "3400936325227", - "3400927884467", - "3400949014736", - "3400949016518", - "3400949947997", - "3400938105155", - "3400938140224", - "3400922471761", - "3400938111828", - "3400936325807", - "3400936325395", - "3400936325166", - "3400937974769", - "3400938268416", - "3400936480612", - "3400936583610", - "3400936677043", - "3400936674080", - "3400936673540", - "3400938140453", - "3400936484283", - "3400936673199", - "3400936673021", - "3400936672710", - "3400936388611", - "3400936388550", - "3400936480490", - "3400936672659", - "3400936668577", - "3400938174922", - "3400938174052", - "3400936668409", - "3400936668348", - "3400939080888", - "3400939389592", - "3400933839932", - "3400939657325", - "3400939151779", - "3400939657264", - "3400939657035", - "3400939154732", - "3400939154442", - "3400939151601", - "3400936356153", - "3400939151489", - "3400939321707", - "3400939145228", - "3400939307398", - "3400938119282", - "3400938119114", - "3400939146287", - "3400939080130", - "3400938119053", - "3400939154213", - "3400939154091", - "3400934095924", - "3400935533661", - "3400935533142", - "3400934303913", - "3400935533081", - "3400936280939", - "3400936290983", - "3400936290006", - 
"3400936034372", - "3400927933417", - "3400922390826", - "3400921623796", - "3400921793222", - "3400921901405", - "3400936323674", - "3400921640137", - "3400921639995", - "3400921638707", - "3400921633443", - "3400921633382", - "3400921604818", - "3400926802004", - "3400926800802", - "3400926813529", - "3400926813178", - "3400921626698", - "3400921626469", - "3400926664992", - "3400926662813", - "3400927931406", - "3400927884238", - "3400927884009", - "3400921848953", - "3400921780093", - "3400922471532", - "3400921638646", - "3400921637694", - "3400949948420", - "3400927930515", - "3400927930393", - "3400927871689", - "3400921851274", - "3400921779493", - "3400921779325", - "3400921623567", - "3400921611380", - "3400921608199", - "3400936583498", - "3400936672888", - "3400922296234", - "3400927800832", - "3400921607888", - "3400921622447", - "3400921606010", - "3400922392196", - "3400922393438", - "3400922393087", - "3400921632552", - "3400922000510", - "3400922000398", - "3400921906837", - "3400936324565", - "3400921632323", - "3400921630251", - "3400921629941", - "3400927932236", - "3400927920639", - "3400930002896", - "3400930000724", - "3400930003039", - "3400922470702", - "3400930015384", - "3400934475528", - "3400935533722", - "3400938268584", - "3400938103953", - "3400936673311", - "3400941610424", - "3400941788413", - "3400949117963", - "3400936483743", - "3400936673601", - "3400938785654", - "3400921928020", - "3400922036359", - "3400936291645", - "3400949004621", - "3400935514561", - "3400930027288", - "3400949946587", - "3400949116324", - "3400941888458", - "3400936887299", - "3400936805385", - "3400949337347", - "3400941785689", - "3400938105445", - "3400936805736", - "3400941609824", - "3400921623338", - "3400949336517", - "3400939524450", - "3400936290815", - "3400936287853", - "3400938108576", - "3400922033747", - "3400949763993", - "3400949721191", - "3400939255729", - "3400939239521", - "3400936290464", - "3400926951535", - "3400949700202", - 
"3400949755141", - "3400938673081", - "3400939072845", - "3400949007691", - "3400936804845", - "3400938954340", - "3400936862685", - "3400936281530", - "3400936286214", - "3400941906299", - "3400949118793", - "3400926822453", - "3400936674370", - "3400937991568", - "3400934707230", - "3400934096754", - "3400934606182", - "3400938118223", - "3400937531191", - "3400934052460", - "3400939320694", - "3400939239811", - "3400939656892", - "3400921928310", - "3400926765750", - "3400949720712", - "3400949719013", - "3400936323155", - "3400939815824", - "3400938139853", - "3400938109757", - "3400941888229", - "3400927824494", - "3400949001323", - "3400949001286", - "3400936674202", - "3400936357624", - "3400936357334", - "3400936357273", - "3400939074276", - "3400930066423", - "3400941859533", - "3400926814700", - "3400937827539", - "3400949498994", - "3400949721832", - "3400938782004", - "3400922037769", - "3400935533890", - "3400935584878", - "3400938784305", - "3400922491752", - "3400927846847", - "3400922034348", - "3400930070567", - "3400936871380", - "3400927932526", - "3400922197487", - "3400949946938", - "3400927931864", - "3400927870910", - "3400934410642", - "3400949940264", - "3400938117912", - "3400938186758", - "3400949001644", - "3400921854466", - "3400927870859", - "3400949768967", - "3400949495863", - "3400938993899", - "3400937828130", - "3400922035987", - "3400934096006", - "3400927817861", - "3400927821592", - "3400926955847", - "3400930060728", - "3400936886810", - "3400927352959", - "3400926954376", - "3400949717750", - "3400949763764", - "3400938117042", - "3400927607554", - "3400934303333", - "3400921995664", - "3400921994025", - "3400938113549", - "3400934475818", - "3400934098017", - "3400927568251", - "3400922197777", - "3400934963261", - "3400922146850", - "3400926817190", - "3400927366338", - "3400927481680", - "3400927418419", - "3400938782172", - "3400938105094", - "3400949758333", - "3400949947010", - "3400938119862", - "3400937827300", - 
"3400921837896", - "3400921836547", - "3400930060216", - "3400922465678", - "3400926666026", - "3400927801082", - "3400930027301", - "3400935584120", - "3400930060711", - "3400934548420", - "3400921605877", - "3400949009763", - "3400938991420", - "3400922470870", - "3400949026104", - "3400949021949", - "3400939152080", - "3400941906060", - "3400949022199", - "3400939655543", - "3400934417566", - "3400941896842", - "3400938113488", - "3400949341139", - "3400941895203", - "3400937974820", - "3400935973467", - "3400934601910", - "3400921628760", - "3400927932984", - "3400939307459", - "3400921604757", - "3400941904578", - "3400938174113", - "3400936677104", - "3400936673489", - "3400936325746", - "3400936484344", - "3400936668638", - "3400941894190", - "3400936668058", - "3400936290693", - "3400936290174", - "3400939146577", - "3400936803954", - "3400949943517", - "3400939144917", - "3400939079820", - "3400938194043", - "3400935532831", - "3400936388499", - "3400927933585", - "3400934417337", - "3400936674141", - "3400922389585", - "3400927368578", - "3400927800542", - "3400927801143", - "3400921792621", - "3400936356443", - "3400936356214", - "3400936323964", - "3400936323735", - "3400936583559", - "3400921779905", - "3400927932816", - "3400921622218", - "3400921638936", - "3400921605068", - "3400926801113", - "3400934096983", - "3400927931574", - "3400927883927", - "3400921637816", - "3400927930225", - "3400922393148", - "3400921630022", - "3400922392028", - "3400922391137", - "3400921627411" - ) - - val ipp = new PharmacologicalClassConfig( - name = "IPP", - ATCCodes = List("A02BC*") - ) - override val pharmacologicalClasses: List[PharmacologicalClassConfig] = List(ipp) -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirMedicalActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirMedicalActExtractor.scala new file mode 100644 index 00000000..7e2bd4f0 --- /dev/null +++ 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirMedicalActExtractor.scala @@ -0,0 +1,104 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import java.sql.Timestamp +import scala.util.Try +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{BiologyDcirAct, DcirAct, EventBuilder, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir.DcirSimpleExtractor +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +/** + * Gets all types of acts from DCIR. + * + * The main addition of this class is the groupId method, which identifies the + * source of the act: Liberal, PublicAmbulatory, PrivateAmbulatory, Unknown, or a + * default DcirAct when the information is not available. + * @param codes List of act codes to be tracked in the study, or empty to get all acts. + */ +abstract sealed class DcirRowActExtractor(codes: SimpleExtractorCodes) extends DcirSimpleExtractor[MedicalAct] + with StartsWithStrategy[MedicalAct] { + + final val PrivateInstitutionCodes = Set(4D, 5D, 6D, 7D) + + // Implementation of the Extractor Trait + override def getCodes: SimpleExtractorCodes = codes + + // Implementation of the EventRowExtractor + override def usedColumns: List[String] = List(ColNames.InstitutionCode, ColNames.GHSCode, ColNames.Sector) ++ super + .usedColumns + + override def extractStart(r: Row): Timestamp = { + Try(super.extractStart(r)) recover { + case _ => makeTS(1970, 1, 1) + } + }.get + + /** + * Gets the origin of the DCIR act being extracted. It returns a + * Failure[IllegalArgumentException] if the DCIR schema is old, and a Success if the DCIR schema contains the information. + + * @param r the row of DCIR to be investigated.
+ * @return Try[String] + */ + // TODO: REMOVE THIS + override def extractGroupId(r: Row): String = { + Try { + + if (!r.isNullAt(r.fieldIndex(ColNames.Sector)) && getSector(r) == 1) { + DcirAct.groupID.PublicAmbulatory + } + else { + if (r.isNullAt(r.fieldIndex(ColNames.GHSCode))) { + DcirAct.groupID.Liberal + } else { + // Value is not null, so the act is not liberal + lazy val ghs = getGHS(r) + lazy val institutionCode = getInstitutionCode(r) + // Check whether it is a private ambulatory act + if (ghs == 0 && PrivateInstitutionCodes.contains(institutionCode)) { + DcirAct.groupID.PrivateAmbulatory + } + else { + DcirAct.groupID.Unknown + } + } + } + } recover { case _: IllegalArgumentException => DcirAct.groupID.DcirAct } + }.get + + private def getGHS(r: Row): Double = r.getAs[Double](ColNames.GHSCode) + + private def getInstitutionCode(r: Row): Double = r.getAs[Double](ColNames.InstitutionCode) + + private def getSector(r: Row): Double = r.getAs[Double](ColNames.Sector) +} + +/** + * Gets the CCAM-coded acts from the DCIR. + * @param codes: List of Act codes to be tracked in the study, or empty to get all the Acts. + */ +final case class DcirMedicalActExtractor(codes: SimpleExtractorCodes) + extends DcirRowActExtractor(codes) { + // Implementation of the BasicExtractor Trait + override val columnName: String = ColNames.CamCode + override val eventBuilder: EventBuilder = DcirAct +} + +/** + * Gets the biology acts from the DCIR. + * @param codes: List of Act codes to be tracked in the study, or empty to get all the Acts. 
+ */ +final case class DcirBiologyActExtractor(codes: SimpleExtractorCodes) + extends DcirRowActExtractor(codes) { + // Implementation of the BasicExtractor Trait + override val columnName: String = ColNames.BioCode + override val eventBuilder: EventBuilder = BiologyDcirAct + + // Because BioCode is a Double + override def extractValue(row: Row): String = row.getAs[Double](columnName).toString + +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/HadCcamActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/HadCcamActExtractor.scala new file mode 100644 index 00000000..65c2baf7 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/HadCcamActExtractor.scala @@ -0,0 +1,15 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HadCCAMAct, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.had.HadSimpleExtractor + +final case class HadCcamActExtractor(codes: SimpleExtractorCodes) extends HadSimpleExtractor[MedicalAct] + with StartsWithStrategy[MedicalAct] { + override val columnName: String = ColNames.CCAM + override val eventBuilder: EventBuilder = HadCCAMAct + override def getCodes: SimpleExtractorCodes = codes +} + + diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCcamActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCcamActExtractor.scala new file mode 100644 index 00000000..47f15e07 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCcamActExtractor.scala @@ -0,0 +1,32 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import java.sql.Timestamp +import 
me.danielpes.spark.datetime.Period +import me.danielpes.spark.datetime.implicits.DateImplicits +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, McoCCAMAct, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mco.McoSimpleExtractor + + +final case class McoCcamActExtractor(codes: SimpleExtractorCodes) extends McoSimpleExtractor[MedicalAct] + with StartsWithStrategy[MedicalAct] { + override val columnName: String = ColNames.CCAM + override val eventBuilder: EventBuilder = McoCCAMAct + + override def usedColumns: List[String] = ColNames.CCAMDelayDate :: super.usedColumns + + override def getCodes: SimpleExtractorCodes = codes + + override def extractStart(r: Row): Timestamp = { + (r.getAs[Timestamp](NewColumns.EstimatedStayStart) + Period(days = getDateOffset(r))).get + } + + def getDateOffset(r: Row): Int = r.getAs[String](ColNames.CCAMDelayDate) match { + case null => 0 + case value: String => value.toInt + } +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCeCcamActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCeCcamActExtractor.scala new file mode 100644 index 00000000..8d893ed6 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCeCcamActExtractor.scala @@ -0,0 +1,17 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + + +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, McoCeCcamAct, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce.McoCeSimpleExtractor + +final case class McoCeCcamActExtractor(codes: 
SimpleExtractorCodes) extends McoCeSimpleExtractor[MedicalAct] + with StartsWithStrategy[MedicalAct] { + override val eventBuilder: EventBuilder = McoCeCcamAct + override val columnName: String = ColNames.CamCode + + override def getCodes: SimpleExtractorCodes = codes +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/MedicalActsConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/MedicalActsConfig.scala similarity index 91% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/MedicalActsConfig.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/MedicalActsConfig.scala index 5c1d6758..79c230de 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/MedicalActsConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/MedicalActsConfig.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.acts +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig @@ -19,7 +19,7 @@ class MedicalActsConfig( val ssrCECodes: List[String], val ssrCSARRCodes: List[String], val hadCCAMCodes: List[String] - ) extends ExtractorConfig +) extends ExtractorConfig object MedicalActsConfig { @@ -41,6 +41,7 @@ object MedicalActsConfig { ssrCSARRCodes, ssrCCAMCodes, ssrCECodes, - hadCCAMCodes) + hadCCAMCodes + ) } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrActExtractor.scala new file mode 100644 index 00000000..733c1942 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrActExtractor.scala @@ -0,0 +1,34 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, MedicalAct, 
SsrCCAMAct, SsrCSARRAct} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr.SsrSimpleExtractor + +final case class SsrCcamActExtractor(codes: SimpleExtractorCodes) extends SsrSimpleExtractor[MedicalAct] with + StartsWithStrategy[MedicalAct] { + override val columnName: String = ColNames.CCAM + override val eventBuilder: EventBuilder = SsrCCAMAct + + override def getCodes: SimpleExtractorCodes = codes +} + +/** Extracts CSARR codes: + * + * The Specific Catalogue of Acts of Rehabilitation and Readaptation (CSARR) is intended to + * describe and code the activity of the professionals concerned in follow-up care and + * rehabilitation establishments (SSR). These acts are to be distinguished from CCAM acts, which + * are the sole responsibility of the doctor. + * + * This terminology is of the form `AAA+111`, e.g. *GKQ+139 : Évaluation initiale du langage écrit* + * + * The complete terminology can be found here: https://drees.shinyapps.io/dico-snds/?variable=FP_PEC&search=csar&table=T_SSRaa_nnB + * For more details see: https://www.atih.sante.fr/sites/default/files/public/content/3302/csarr_2018.pdf + */ +final case class SsrCsarrActExtractor(codes: SimpleExtractorCodes) extends SsrSimpleExtractor[MedicalAct] with + StartsWithStrategy[MedicalAct] { + override val columnName: String = ColNames.CSARR + override val eventBuilder: EventBuilder = SsrCSARRAct + + override def getCodes: SimpleExtractorCodes = codes +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrCeActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrCeActExtractor.scala new file mode 100644 index 00000000..55b768a0 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrCeActExtractor.scala @@ -0,0 +1,15 @@ 
+package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, MedicalAct, SsrCEAct} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce.SsrCeSimpleExtractor + +final case class SsrCeActExtractor(codes: SimpleExtractorCodes) extends SsrCeSimpleExtractor[MedicalAct] + with StartsWithStrategy[MedicalAct] { + override def columnName: String = ColNames.CamCode + + override def eventBuilder: EventBuilder = SsrCEAct + + override def getCodes: SimpleExtractorCodes = codes +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/classifications/GhmExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/classifications/GhmExtractor.scala new file mode 100644 index 00000000..93ab9190 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/classifications/GhmExtractor.scala @@ -0,0 +1,16 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.classifications + +import fr.polytechnique.cmap.cnam.etl.events.{Classification, EventBuilder, GHMClassification} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mco.McoSimpleExtractor + +final case class GhmExtractor(codes: SimpleExtractorCodes) extends McoSimpleExtractor[Classification] + with StartsWithStrategy[Classification] { + override val columnName: String = ColNames.GHM + override val eventBuilder: EventBuilder = GHMClassification + + override def getCodes: SimpleExtractorCodes = codes +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/DiagnosesConfig.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/DiagnosesConfig.scala similarity index 92% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/DiagnosesConfig.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/DiagnosesConfig.scala index a10c8cfa..d65ddac4 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/DiagnosesConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/DiagnosesConfig.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses +package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/HadDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/HadDiagnosisExtractor.scala new file mode 100644 index 00000000..6c7ab966 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/HadDiagnosisExtractor.scala @@ -0,0 +1,24 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses + +import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, EventBuilder, HadAssociatedDiagnosis, HadMainDiagnosis} +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.had.HadSimpleExtractor + +final case class HadMainDiagnosisExtractor(codes: SimpleExtractorCodes) extends HadSimpleExtractor[Diagnosis] with + StartsWithStrategy[Diagnosis] { + override val columnName: String = ColNames.DP + override val eventBuilder: EventBuilder = HadMainDiagnosis + + override def getCodes: SimpleExtractorCodes = codes +} + +final case class 
HadAssociatedDiagnosisExtractor(codes: SimpleExtractorCodes) extends HadSimpleExtractor[Diagnosis] with + StartsWithStrategy[Diagnosis] { + override val columnName: String = ColNames.DA + override val eventBuilder: EventBuilder = HadAssociatedDiagnosis + + override def getCodes: SimpleExtractorCodes = codes +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/ImbCimDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/ImbCimDiagnosisExtractor.scala new file mode 100644 index 00000000..f5900142 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/ImbCimDiagnosisExtractor.scala @@ -0,0 +1,33 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses + +import org.apache.spark.sql.{DataFrame, Row} +import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, EventBuilder, ImbCcamDiagnosis} +import fr.polytechnique.cmap.cnam.etl.extractors.IsInStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.imb.ImbSimpleExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + + +final case class ImbCimDiagnosisExtractor(codes: SimpleExtractorCodes) extends ImbSimpleExtractor[Diagnosis] + with IsInStrategy[Diagnosis] { + + override def isInExtractorScope(row: Row): Boolean = { + lazy val idx = row.fieldIndex(ColNames.Code) + extractEncoding(row) == "CIM10" || !row.isNullAt(idx) + } + + override def isInStudy(row: Row): Boolean = codes.exists(extractValue(row).startsWith(_)) + + override def getInput(sources: Sources): DataFrame = sources.irImb.get + + override def columnName: String = ColNames.Code + + override def eventBuilder: EventBuilder = ImbCcamDiagnosis + + override def neededColumns: List[String] = + List(ColNames.PatientID, ColNames.Date, ColNames.Encoding, ColNames.Code, 
ColNames.EndDate) + + override def getCodes: SimpleExtractorCodes = codes +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/McoDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/McoDiagnosisExtractor.scala new file mode 100644 index 00000000..d12b614b --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/McoDiagnosisExtractor.scala @@ -0,0 +1,31 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses + +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mco.McoSimpleExtractor + +protected trait SimpleMcoDiagnosisExtractor extends McoSimpleExtractor[Diagnosis] with StartsWithStrategy[Diagnosis] + +case class McoMainDiagnosisExtractor(codes: SimpleExtractorCodes) extends SimpleMcoDiagnosisExtractor { + override val columnName: String = ColNames.DP + override val eventBuilder: EventBuilder = McoMainDiagnosis + + override def getCodes: SimpleExtractorCodes = codes +} + +case class McoAssociatedDiagnosisExtractor(codes: SimpleExtractorCodes) extends SimpleMcoDiagnosisExtractor { + override val columnName: String = ColNames.DA + override val eventBuilder: EventBuilder = McoAssociatedDiagnosis + + override def getCodes: SimpleExtractorCodes = codes +} + +case class McoLinkedDiagnosisExtractor(codes: SimpleExtractorCodes) extends SimpleMcoDiagnosisExtractor { + override val columnName: String = ColNames.DR + override val eventBuilder: EventBuilder = McoLinkedDiagnosis + + override def getCodes: SimpleExtractorCodes = codes +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/SsrDiagnosisExtractor.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/SsrDiagnosisExtractor.scala new file mode 100644 index 00000000..54e1b371 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/SsrDiagnosisExtractor.scala @@ -0,0 +1,33 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses + +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.StartsWithStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr.SsrSimpleExtractor + +protected sealed abstract class SsrDiagnosisExtractor(codes: SimpleExtractorCodes) extends SsrSimpleExtractor[Diagnosis] with + StartsWithStrategy[Diagnosis] { + override def getCodes: SimpleExtractorCodes = codes +} + +final case class SsrMainDiagnosisExtractor(codes: SimpleExtractorCodes) extends SsrDiagnosisExtractor(codes) { + override val columnName: String = ColNames.DP + override val eventBuilder: EventBuilder = SsrMainDiagnosis +} + +final case class SsrAssociatedDiagnosisExtractor(codes: SimpleExtractorCodes) extends SsrDiagnosisExtractor(codes) { + override val columnName: String = ColNames.DA + override val eventBuilder: EventBuilder = SsrAssociatedDiagnosis +} + +final case class SsrLinkedDiagnosisExtractor(codes: SimpleExtractorCodes) extends SsrDiagnosisExtractor(codes) { + override val columnName: String = ColNames.DR + override val eventBuilder: EventBuilder = SsrLinkedDiagnosis +} + +final case class SsrTakingOverPurposeExtractor(codes: SimpleExtractorCodes) extends SsrDiagnosisExtractor(codes) { + override val columnName: String = ColNames.FP_PEC + override val eventBuilder: EventBuilder = SsrTakingOverPurpose +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugConfig.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugConfig.scala new file mode 100644 index 00000000..3fe1bd20 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugConfig.scala @@ -0,0 +1,20 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs + +import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig +import fr.polytechnique.cmap.cnam.etl.extractors.codes.ExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.DrugClassificationLevel + +class DrugConfig( + val level: DrugClassificationLevel, + val families: List[DrugClassConfig]) extends ExtractorConfig with ExtractorCodes { + override def isEmpty: Boolean = families.isEmpty +} + +object DrugConfig { + def apply(level: DrugClassificationLevel, families: List[DrugClassConfig]): DrugConfig = new DrugConfig( + level, families + ) +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugExtractor.scala similarity index 59% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugExtractor.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugExtractor.scala index a63a3b74..682fe299 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugExtractor.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugExtractor.scala @@ -1,60 +1,21 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs import java.sql.Timestamp -import scala.reflect.runtime.universe import org.apache.commons.codec.binary.Base64 -import org.apache.spark.sql._ import 
org.apache.spark.sql.functions.{col, when} import org.apache.spark.sql.types.{StringType, TimestampType} +import org.apache.spark.sql.{Column, DataFrame, Row} import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event} import fr.polytechnique.cmap.cnam.etl.extractors.Extractor import fr.polytechnique.cmap.cnam.etl.sources.Sources -class DrugExtractor(drugConfig: DrugConfig) extends Extractor[Drug] { +class DrugExtractor(drugConfig: DrugConfig) extends Extractor[Drug, DrugConfig] { - override def extract( - sources: Sources, - codes: Set[String]) - (implicit ctag: universe.TypeTag[Drug]): Dataset[Event[Drug]] = { + override def getCodes: DrugConfig = drugConfig - val input: DataFrame = getInput(sources) - - import input.sqlContext.implicits._ - - { - if (drugConfig.families.isEmpty) { - input.filter(isInExtractorScope _) - } - else { - input.filter(isInExtractorScope _).filter(isInStudy(codes) _) - } - }.flatMap(builder _).distinct() - - } - - - - def extractGroupId(r: Row): String = { - Base64.encodeBase64(s"${r.getAs[String](ColNames.FluxDate)}_${r.getAs[String](ColNames.FluxProcessingDate)}_${ - r.getAs[String]( - ColNames - .EmitterType - ) - }_${r.getAs[String](ColNames.EmitterId)}_${r.getAs[String](ColNames.FluxSeqNumber)}_${ - r.getAs[String]( - ColNames - .OrganisationOldId - ) - }_${r.getAs[String](ColNames.OrganisationDecompteNumber)}".getBytes()).map(_.toChar).mkString - - - } - - - override def isInStudy(codes: Set[String]) - (row: Row): Boolean = drugConfig.level.isInFamily(drugConfig.families, row) + override def isInStudy(row: Row): Boolean = drugConfig.level.isInFamily(drugConfig.families, row) override def isInExtractorScope(row: Row): Boolean = true @@ -64,8 +25,9 @@ class DrugExtractor(drugConfig: DrugConfig) extends Extractor[Drug] { lazy val patientID = getPatientID(row) lazy val conditioning = getConditioning(row) lazy val date = getEventDate(row) + lazy val groupID = extractGroupId(row) - classification.map(code => Drug(patientID, code, 
conditioning, date)) + classification.map(code => Drug(patientID, code, conditioning, groupID, date)) } private def getPatientID(row: Row): String = row.getAs[String](ColNames.PatientId) @@ -74,7 +36,32 @@ class DrugExtractor(drugConfig: DrugConfig) extends Extractor[Drug] { private def getEventDate(row: Row): Timestamp = row.getAs[Timestamp](ColNames.Date) + /** It generates a hash using the values of these columns + * (FLX_DIS_DTD,FLX_TRT_DTD,FLX_EMT_TYP,FLX_EMT_NUM,FLX_EMT_ORD,ORG_CLE_NUM,DCT_ORD_NUM). + * It identifies each prescription uniquely, and can be used to identify + * possible interactions of molecules prescribed in the same period. + * + * @param r The Row object itself. + * @return A unique hash ID as a string. + */ + def extractGroupId(r: Row): String = { + Base64.encodeBase64( + s"${r.getAs[String](ColNames.FluxDate)}_${r.getAs[String](ColNames.FluxProcessingDate)}_${ + r.getAs[String]( + ColNames + .EmitterType + ) + }_${r.getAs[String](ColNames.EmitterId)}_${r.getAs[String](ColNames.FluxSeqNumber)}_${ + r.getAs[String]( + ColNames + .OrganisationOldId + ) + }_${r.getAs[String](ColNames.OrganisationDecompteNumber)}".getBytes() + ).map(_.toChar).mkString + } + override def getInput(sources: Sources): DataFrame = { + val neededColumns: List[Column] = List( col("NUM_ENQ").cast(StringType).as("patientID"), col("ER_PHA_F__PHA_PRS_C13").cast(StringType).as("CIP13"), @@ -82,12 +69,10 @@ col("EXE_SOI_DTD").cast(TimestampType).as("eventDate"), col("molecule_combination").cast(StringType).as("molecules"), col("PHA_CND_TOP").cast(StringType).as("conditioning") - ) + ) ::: ColNames.GroupID.map(col) lazy val irPhaR = sources.irPha.get lazy val dcir = sources.dcir.get - val spark: SparkSession = dcir.sparkSession - lazy val df: DataFrame = dcir.join(irPhaR, dcir.col("ER_PHA_F__PHA_PRS_C13") === irPhaR.col("PHA_CIP_C13")) df @@ -96,12 +81,7 @@ class 
DrugExtractor(drugConfig: DrugConfig) extends Extractor[Drug] { .na.drop(Seq("eventDate", "CIP13", "ATC5")) } - final object ColNames { - val PatientId = "patientID" - val Conditioning = "conditioning" - val Date = "eventDate" - val Cip13 = "CIP13" - + final object ColNames extends Serializable { lazy val FluxDate = "FLX_DIS_DTD" lazy val FluxProcessingDate = "FLX_TRT_DTD" lazy val EmitterType = "FLX_EMT_TYP" @@ -109,6 +89,17 @@ class DrugExtractor(drugConfig: DrugConfig) extends Extractor[Drug] { lazy val FluxSeqNumber = "FLX_EMT_ORD" lazy val OrganisationOldId = "ORG_CLE_NUM" lazy val OrganisationDecompteNumber = "DCT_ORD_NUM" + lazy val GroupID = List( + FluxDate, FluxProcessingDate, EmitterType, EmitterId, FluxSeqNumber, OrganisationOldId, OrganisationDecompteNumber + ) + val PatientId = "patientID" + val Conditioning = "conditioning" + val Date = "eventDate" + val Cip13 = "CIP13" } } + +object DrugExtractor { + def apply(drugConfig: DrugConfig): DrugExtractor = new DrugExtractor(drugConfig) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/DrugClassConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/DrugClassConfig.scala similarity index 73% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/DrugClassConfig.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/DrugClassConfig.scala index 75e0aee0..27d37ede 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/DrugClassConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/DrugClassConfig.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification import java.io.Serializable diff --git 
a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/PharmacologicalClassConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/PharmacologicalClassConfig.scala similarity index 92% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/PharmacologicalClassConfig.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/PharmacologicalClassConfig.scala index 71710a7d..7c04685a 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/PharmacologicalClassConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/PharmacologicalClassConfig.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification class PharmacologicalClassConfig( val name: String, diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antidepresseurs.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antidepresseurs.scala similarity index 98% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antidepresseurs.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antidepresseurs.scala index 1f48261c..e293d5aa 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antidepresseurs.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antidepresseurs.scala @@ -1,8 +1,8 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families +package 
fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} object Antidepresseurs extends DrugClassConfig { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antiepileptics.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antiepileptics.scala similarity index 98% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antiepileptics.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antiepileptics.scala index 8efa5d44..65fa4673 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antiepileptics.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antiepileptics.scala @@ -1,8 +1,8 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} object Antiepileptics extends DrugClassConfig { override val name: String = "Antiepileptics" diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antihypertenseurs.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antihypertenseurs.scala similarity index 99% rename from 
src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antihypertenseurs.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antihypertenseurs.scala index a432c9df..40362257 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Antihypertenseurs.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Antihypertenseurs.scala @@ -1,8 +1,8 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} object Antihypertenseurs extends DrugClassConfig { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Cardiac.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Cardiac.scala similarity index 50% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Cardiac.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Cardiac.scala index 23968369..c9e5639c 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Cardiac.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Cardiac.scala @@ -1,25 +1,25 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families -import 
fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} object Cardiac extends DrugClassConfig { override val name: String = "CardiacTherapy" override val cip13Codes: Set[String] = Set( - "3400933489045", - "3400930313374", - "3400930313206", - "3400930193945", - "3400931163411", - "3400932346554", - "3400933466091", - "3400930313893" + "3400933489045", + "3400930313374", + "3400930313206", + "3400930193945", + "3400931163411", + "3400932346554", + "3400933466091", + "3400930313893" ) - val cardiacGlycosides = new PharmacologicalClassConfig( name = "CardiacGlycosides", ATCCodes = List("C01AA*") ) override val pharmacologicalClasses: List[PharmacologicalClassConfig] = List(cardiacGlycosides) + } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Hypnotiques.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Hypnotiques.scala similarity index 97% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Hypnotiques.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Hypnotiques.scala index aeb60d13..e713b47c 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Hypnotiques.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Hypnotiques.scala @@ -1,8 +1,8 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import 
fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} object Hypnotiques extends DrugClassConfig { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Neuroleptiques.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Neuroleptiques.scala similarity index 98% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Neuroleptiques.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Neuroleptiques.scala index 1fa2a503..1f1892ae 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/classification/families/Neuroleptiques.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Neuroleptiques.scala @@ -1,8 +1,8 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} object Neuroleptiques extends DrugClassConfig { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Opioids.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Opioids.scala new file mode 100644 index 00000000..2957dc56 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/Opioids.scala @@ -0,0 +1,480 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families + +import 
fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} + +object Opioids extends DrugClassConfig { + override val name: String = "Opioids" + override val cip13Codes: Set[String] = Set( + "3400938747393", + "3400939023649", + "3400935770073", + "3400936114548", + "3400936114487", + "3400934828096", + "3400937479462", + "3400937489928", + "3400937484145", + "3400934992308", + "3400934990007", + "3400934951374", + "3400930378540", + "3400933715014", + "3400935641281", + "3400930411469", + "3400927319143", + "3400936853690", + "3400934314797", + "3400934314568", + "3400934314278", + "3400934808166", + "3400934802362", + "3400936102132", + "3400926845926", + "3400938229783", + "3400936853812", + "3400935793195", + "3400926980801", + "3400926797348", + "3400936319714", + "3400927671029", + "3400939104874", + "3400935877154", + "3400935106704", + "3400934538193", + "3400939221601", + "3400939221540", + "3400935620699", + "3400936907553", + "3400934991295", + "3400935620170", + "3400930411230", + "3400931959328", + "3400936206830", + "3400936248182", + "3400936247123", + "3400936970366", + "3400935067340", + "3400935349897", + "3400938127676", + "3400938227024", + "3400931959038", + "3400931958956", + "3400935486554", + "3400931959496", + "3400935156440", + "3400935155788", + "3400938231045", + "3400936969247", + "3400936968936", + "3400936672598", + "3400935438478", + "3400927321214", + "3400938228373", + "3400936247642", + "3400936206779", + "3400936102651", + "3400927607035", + "3400927562396", + "3400927597237", + "3400935888730", + "3400936289406", + "3400936289284", + "3400938399097", + "3400939826240", + "3400936969995", + "3400938222920", + "3400936812420", + "3400936810709", + "3400936809758", + "3400936748804", + "3400933220754", + "3400933304652", + "3400933319090", + "3400934976339", + "3400934976278", + "3400936041295", + "3400936041417", + "3400938212983", + "3400933316778", + "3400935486783", + 
"3400935404640", + "3400934499760", + "3400936289635", + "3400935567031", + "3400935438249", + "3400922096032", + "3400922096322", + "3400936141391", + "3400935748713", + "3400935748652", + "3400936787193", + "3400934654329", + "3400934654039", + "3400938552454", + "3400935350039", + "3400937356473", + "3400938675443", + "3400938675214", + "3400935349958", + "3400937356305", + "3400937700177", + "3400935856913", + "3400934654787", + "3400934654558", + "3400930303320", + "3400934007286", + "3400935194121", + "3400934882845", + "3400935877505", + "3400935193810", + "3400937051323", + "3400937015943", + "3400933180850", + "3400930002285", + "3400930002278", + "3400926690908", + "3400926963422", + "3400938459852", + "3400939846460", + "3400934890659", + "3400939392325", + "3400934890888", + "3400933911881", + "3400934802133", + "3400938042887", + "3400938042597", + "3400939903392", + "3400939641706", + "3400939711874", + "3400938042139", + "3400921856996", + "3400921856828", + "3400922198088", + "3400921857658", + "3400921879599", + "3400949251087", + "3400949250547", + "3400949914821", + "3400949217540", + "3400949217250", + "3400949217199", + "3400930332016", + "3400939202099", + "3400939641355", + "3400939640983", + "3400939711935", + "3400938510232", + "3400927560446", + "3400927658532", + "3400927657702", + "3400934410932", + "3400934399435", + "3400934399084", + "3400934398544", + "3400934291463", + "3400934291234", + "3400934387609", + "3400934387258", + "3400927656989", + "3400927748752", + "3400927760587", + "3400927759239", + "3400927757976", + "3400939220710", + "3400938509342", + "3400938509861", + "3400934890420", + "3400932869947", + "3400927659591", + "3400938458732", + "3400935438300", + "3400936587281", + "3400936985155", + "3400935695161", + "3400936911635", + "3400936911055", + "3400936910683", + "3400939200668", + "3400939104355", + "3400938460223", + "3400938533415", + "3400938509113", + "3400938124255", + "3400939024479", + "3400938398618", + 
"3400926961121", + "3400939104584", + "3400939104416", + "3400939105765", + "3400939105185", + "3400939104935", + "3400939104706", + "3400934053702", + "3400938508741", + "3400938508512", + "3400938652840", + "3400938652321", + "3400927756337", + "3400927753725", + "3400927751653", + "3400939221021", + "3400927755095", + "3400931164531", + "3400934238536", + "3400934238307", + "3400934021305", + "3400934641541", + "3400922301938", + "3400934238475", + "3400936910515", + "3400936102422", + "3400936894563", + "3400934866067", + "3400935322982", + "3400922096261", + "3400935349729", + "3400949363940", + "3400939190051", + "3400936907782", + "3400936906891", + "3400936906372", + "3400936906143", + "3400936905771", + "3400935807366", + "3400935806826", + "3400927890031", + "3400938651089", + "3400938650488", + "3400938225532", + "3400936794238", + "3400926942489", + "3400935857392", + "3400935154729", + "3400939200439", + "3400934387487", + "3400933684792", + "3400934748042", + "3400939104645", + "3400939221489", + "3400939221311", + "3400939221250", + "3400939221199", + "3400939220888", + "3400939213507", + "3400939213446", + "3400932952724", + "3400932870028", + "3400932869718", + "3400922303420", + "3400922303079", + "3400922302768", + "3400922302300", + "3400921856477", + "3400935703477", + "3400930777473", + "3400935018663", + "3400930777534", + "3400933323813", + "3400933323752", + "3400935130594", + "3400939186788", + "3400936910454", + "3400936910225", + "3400949217489", + "3400939187679", + "3400935998651", + "3400932869886", + "3400949914531", + "3400934890130", + "3400927560965", + "3400927561337", + "3400927888601", + "3400935806307", + "3400935349378", + "3400939185668", + "3400939726205", + "3400939725192", + "3400935619921", + "3400935844217", + "3400935843845", + "3400936203068", + "3400936242388", + "3400936242159", + "3400936241909", + "3400935422279", + "3400927656750", + "3400939314952", + "3400939712017", + "3400921857948", + "3400949251209", + 
"3400949378678", + "3400934827846", + "3400922095950", + "3400936809178", + "3400930068571", + "3400921857139", + "3400949812134", + "3400949666621", + "3400949666331", + "3400949915590", + "3400933323691", + "3400933323523", + "3400931164821", + "3400939391144", + "3400939390543", + "3400939342757", + "3400939844510", + "3400930068519", + "3400930068649", + "3400921857887", + "3400921857719", + "3400921857597", + "3400921856767", + "3400921856538", + "3400921857368", + "3400921857078", + "3400949914302", + "3400938504897", + "3400935429483", + "3400935421500", + "3400933480059", + "3400933479978", + "3400933799229", + "3400927889721", + "3400927889370", + "3400927889080", + "3400927561108", + "3400939827070", + "3400935595874", + "3400935615558", + "3400936651548", + "3400934760228", + "3400939755588", + "3400939755878", + "3400939476803", + "3400939417899", + "3400939417370", + "3400932551897", + "3400930075722", + "3400933803681", + "3400939478173", + "3400939476223", + "3400939479415", + "3400935714244", + "3400939118833", + "3400926939359", + "3400927656699", + "3400935509369", + "3400936587632", + "3400930051047", + "3400933803452", + "3400939477404", + "3400939825649", + "3400930057834", + "3400932966332", + "3400935185884", + "3400935107534", + "3400935660091", + "3400935108074", + "3400935107183", + "3400935671677", + "3400930068960", + "3400930068892", + "3400927657641", + "3400926845117", + "3400927656811", + "3400927658471", + "3400927655630", + "3400927655920", + "3400927659423", + "3400930068823", + "3400935982452", + "3400927656002", + "3400930076033", + "3400930075937", + "3400930075623", + "3400937847698", + "3400934827617", + "3400935703767", + "3400930587508", + "3400935065100", + "3400935703248", + "3400921879711", + "3400936969537", + "3400936595965", + "3400949214587", + "3400936819344", + "3400935531650", + "3400931308492", + "3400939478814", + "3400939213736", + "3400921857429", + "3400936289055", + "3400935235893", + "3400930075388", + 
"3400934170195", + "3400933305314", + "3400932461219", + "3400939391892", + "3400939843629", + "3400927660313", + "3400927560675", + "3400930587737", + "3400935299130", + "3400938324747", + "3400933724467", + "3400938149500", + "3400936690349", + "3400935157041", + "3400935236036", + "3400936968646", + "3400937374514", + "3400938509571", + "3400927513541", + "3400935414007", + "3400926721466", + "3400926838072", + "3400935583239", + "3400935404701", + "3400935565839", + "3400930014103", + "3400930045350", + "3400939699257", + "3400938458442", + "3400935420909", + "3400927655869", + "3400935713063", + "3400933275815", + "3400937015653", + "3400949217311", + "3400939200729", + "3400934300660", + "3400936212053", + "3400938125955", + "3400935570970", + "3400936289925", + "3400939845340", + "3400936906204", + "3400936105034", + "3400939723471", + "3400935666826", + "3400935843494", + "3400935768872", + "3400938508970", + "3400938651720", + "3400936853751", + "3400939104294", + "3400934238765", + "3400949363599", + "3400939220949", + "3400933765910", + "3400927660252", + "3400935248442", + "3400926943141", + "3400935694911", + "3400939185897" + ) + override val pharmacologicalClasses: List[PharmacologicalClassConfig] = List.empty +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/ProtonPumpInhibitors.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/ProtonPumpInhibitors.scala new file mode 100644 index 00000000..5fa3c333 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/classification/families/ProtonPumpInhibitors.scala @@ -0,0 +1,968 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families + +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} + +object ProtonPumpInhibitors extends 
DrugClassConfig { + override val name: String = "ProtonPumpInhibitors" + override val cip13Codes: Set[String] = Set( + "3400949001279", + "3400938104905", + "3400938268706", + "3400938278170", + "3400938277920", + "3400938277579", + "3400938277340", + "3400949766208", + "3400949015856", + "3400949015795", + "3400949337118", + "3400949336456", + "3400949001637", + "3400949011834", + "3400938103205", + "3400949011544", + "3400936291706", + "3400936291584", + "3400936290754", + "3400936287792", + "3400936288515", + "3400936287914", + "3400934868139", + "3400941787461", + "3400949337286", + "3400927818172", + "3400927823206", + "3400922392486", + "3400939074566", + "3400936747333", + "3400936884397", + "3400949014217", + "3400938104035", + "3400938144307", + "3400949001064", + "3400949498765", + "3400941889578", + "3400949000500", + "3400949001057", + "3400941788642", + "3400941783388", + "3400949001033", + "3400949001026", + "3400949001019", + "3400938954401", + "3400937530880", + "3400936483163", + "3400936676732", + "3400936674431", + "3400936673779", + "3400936673250", + "3400921838329", + "3400937827997", + "3400938782462", + "3400936747272", + "3400921854817", + "3400938112887", + "3400949946419", + "3400949968879", + "3400949941155", + "3400938106107", + "3400941859014", + "3400927481741", + "3400939307688", + "3400921850215", + "3400949117154", + "3400949117093", + "3400949116263", + "3400941902109", + "3400941889349", + "3400941888168", + "3400936887411", + "3400936887350", + "3400936884168", + "3400936883918", + "3400936883857", + "3400936805446", + "3400936805217", + "3400936676961", + "3400921992304", + "3400949004799", + "3400949003440", + "3400941787522", + "3400941607981", + "3400941784569", + "3400921994773", + "3400949008643", + "3400936804784", + "3400941904868", + "3400938119343", + "3400930003060", + "3400938142693", + "3400938140163", + "3400938112139", + "3400938112078", + "3400938109986", + "3400938109696", + "3400936483514", + "3400941611834", + 
"3400941610073", + "3400941609244", + "3400941608292", + "3400949008124", + "3400949007523", + "3400938106336", + "3400949015917", + "3400936747562", + "3400935674289", + "3400936281820", + "3400935842374", + "3400935974419", + "3400936284432", + "3400936886759", + "3400949001330", + "3400949000517", + "3400939213156", + "3400939524689", + "3400936883796", + "3400936282940", + "3400939072784", + "3400922389875", + "3400938118452", + "3400936804036", + "3400936483682", + "3400938117103", + "3400949761005", + "3400949946358", + "3400949001354", + "3400949001347", + "3400941781209", + "3400941781087", + "3400941780837", + "3400938117790", + "3400922489223", + "3400927607493", + "3400926663872", + "3400949699858", + "3400949970889", + "3400936288744", + "3400934081217", + "3400934130106", + "3400934052750", + "3400934455186", + "3400934455247", + "3400926957278", + "3400927568022", + "3400927567711", + "3400927567650", + "3400922296005", + "3400934415555", + "3400934415494", + "3400930011638", + "3400921927948", + "3400921928198", + "3400921848083", + "3400921847543", + "3400926955786", + "3400926954086", + "3400935585011", + "3400930060735", + "3400930060704", + "3400922295923", + "3400922296463", + "3400922197609", + "3400921836776", + "3400921611212", + "3400921633672", + "3400933839352", + "3400934869259", + "3400934869020", + "3400933839291", + "3400933839871", + "3400930011515", + "3400930002926", + "3400930011508", + "3400930060193", + "3400926823405", + "3400926823283", + "3400934963841", + "3400934475757", + "3400934868429", + "3400922197548", + "3400922197319", + "3400922403526", + "3400922095431", + "3400930060179", + "3400930060698", + "3400930060247", + "3400921636345", + "3400921634914", + "3400935674401", + "3400936862746", + "3400949762644", + "3400949719181", + "3400926816889", + "3400926815530", + "3400926957568", + "3400927359583", + "3400927359415", + "3400922037820", + "3400922037530", + "3400922037479", + "3400939030272", + "3400933341213", + 
"3400933341152", + "3400936323094", + "3400935533951", + "3400930066409", + "3400930066386", + "3400930066379", + "3400935585479", + "3400935584939", + "3400935584588", + "3400935533432", + "3400935533203", + "3400935585301", + "3400922037240", + "3400922036069", + "3400922034287", + "3400922033808", + "3400935584298", + "3400934706639", + "3400934081095", + "3400921849844", + "3400921852516", + "3400921852165", + "3400935584700", + "3400935584410", + "3400936282360", + "3400937991339", + "3400930070604", + "3400930070598", + "3400930070574", + "3400934456077", + "3400936871731", + "3400936871670", + "3400936871441", + "3400934410413", + "3400934303562", + "3400934081446", + "3400921928259", + "3400921995954", + "3400921994315", + "3400921998627", + "3400922000459", + "3400921993653", + "3400921992823", + "3400949766376", + "3400936668119", + "3400936282889", + "3400936805965", + "3400922463438", + "3400926955908", + "3400939389653", + "3400938778502", + "3400938776379", + "3400949721542", + "3400949721481", + "3400938991598", + "3400938277869", + "3400927932694", + "3400936288683", + "3400949766666", + "3400949718061", + "3400949755370", + "3400949494514", + "3400949759804", + "3400949495573", + "3400927823374", + "3400949718412", + "3400949718351", + "3400949702091", + "3400949701612", + "3400949494743", + "3400949757152", + "3400949758791", + "3400949758562", + "3400949757381", + "3400949487189", + "3400949970421", + "3400949946297", + "3400949946129", + "3400927931925", + "3400927931116", + "3400927930683", + "3400927871399", + "3400927871221", + "3400927929854", + "3400935881694", + "3400939307749", + "3400939307510", + "3400939255439", + "3400939541778", + "3400939271286", + "3400939322018", + "3400949754021", + "3400938703979", + "3400938703740", + "3400939075228", + "3400939030623", + "3400939073095", + "3400939030562", + "3400938117561", + "3400938117271", + "3400938119633", + "3400938119572", + "3400938112948", + "3400937827188", + "3400938118681", + 
"3400939321646", + "3400936287105", + "3400936287044", + "3400934549021", + "3400934706868", + "3400934705519", + "3400934097935", + "3400939030333", + "3400934455766", + "3400938991710", + "3400934705229", + "3400941784279", + "3400939072555", + "3400949003099", + "3400941787751", + "3400934130045", + "3400927931284", + "3400927929915", + "3400927366796", + "3400926665654", + "3400927418358", + "3400939239699", + "3400939321936", + "3400937991278", + "3400949719532", + "3400949718870", + "3400930011669", + "3400927607264", + "3400935533371", + "3400949968411", + "3400936284661", + "3400938193503", + "3400938108866", + "3400934606533", + "3400938104844", + "3400922402406", + "3400949753949", + "3400949718122", + "3400949498536", + "3400936287273", + "3400949755202", + "3400949719471", + "3400926953775", + "3400949946068", + "3400949945986", + "3400949946877", + "3400949946709", + "3400949946648", + "3400949941445", + "3400922462776", + "3400922465210", + "3400926668327", + "3400926667894", + "3400926661281", + "3400926663414", + "3400939541198", + "3400939320816", + "3400939320755", + "3400939213385", + "3400939213217", + "3400939213095", + "3400939239989", + "3400949767786", + "3400939030104", + "3400939072326", + "3400938781052", + "3400938780802", + "3400938778380", + "3400938785364", + "3400949939893", + "3400938118513", + "3400938118162", + "3400938194333", + "3400937974998", + "3400938194104", + "3400938193732", + "3400939541549", + "3400938784015", + "3400938186529", + "3400922403816", + "3400926951306", + "3400935514790", + "3400936290525", + "3400936282421", + "3400936668287", + "3400934304163", + "3400927607615", + "3400934415784", + "3400934475696", + "3400922491233", + "3400922490281", + "3400927417818", + "3400939322186", + "3400937530941", + "3400941902048", + "3400949026272", + "3400949014156", + "3400926661632", + "3400922489452", + "3400938407525", + "3400921634792", + "3400941783159", + "3400930060223", + "3400949000838", + "3400930011539", + 
"3400934097126", + "3400922296173", + "3400949769278", + "3400949769049", + "3400949767908", + "3400949498246", + "3400949498017", + "3400949762415", + "3400949000494", + "3400927357572", + "3400922491691", + "3400922491462", + "3400922490403", + "3400927824326", + "3400927823435", + "3400949721771", + "3400939257679", + "3400939257440", + "3400939541488", + "3400939540948", + "3400939656953", + "3400939656724", + "3400949761463", + "3400938778151", + "3400938776140", + "3400938993660", + "3400935941879", + "3400937828369", + "3400937828598", + "3400927357343", + "3400927352898", + "3400926766870", + "3400926766351", + "3400927607325", + "3400927481512", + "3400939389882", + "3400937828079", + "3400939075167", + "3400939080710", + "3400941785511", + "3400922036298", + "3400927357404", + "3400936482852", + "3400927822544", + "3400926766580", + "3400922033976", + "3400926822163", + "3400926815998", + "3400927818004", + "3400927817922", + "3400927822483", + "3400936482913", + "3400949721023", + "3400949720880", + "3400949720651", + "3400949719242", + "3400949721313", + "3400949718702", + "3400927417986", + "3400938785074", + "3400949761173", + "3400949717811", + "3400949717699", + "3400949765256", + "3400949765027", + "3400927821714", + "3400927821653", + "3400927800481", + "3400927568190", + "3400949718580", + "3400939388472", + "3400939388243", + "3400936323216", + "3400934476068", + "3400938408416", + "3400939816074", + "3400949763535", + "3400949760053", + "3400937827829", + "3400937827768", + "3400938116731", + "3400938116502", + "3400938113600", + "3400927481451", + "3400927481390", + "3400927418129", + "3400927418068", + "3400927417757", + "3400927933295", + "3400927933127", + "3400939815763", + "3400938113259", + "3400938113020", + "3400939079998", + "3400921609370", + "3400938175172", + "3400949339648", + "3400936324336", + "3400921636574", + "3400921628999", + "3400949341948", + "3400941657320", + "3400941896613", + "3400936806627", + "3400921607949", + 
"3400941895371", + "3400921793390", + "3400927932175", + "3400936290235", + "3400921851335", + "3400936484115", + "3400927930973", + "3400936673830", + "3400921792560", + "3400936281240", + "3400934615528", + "3400936325975", + "3400949766727", + "3400949766437", + "3400936672949", + "3400949011773", + "3400949022311", + "3400949022250", + "3400949341887", + "3400949341719", + "3400949341368", + "3400949341078", + "3400941904400", + "3400949016686", + "3400949016457", + "3400949623228", + "3400941659041", + "3400941658969", + "3400941895661", + "3400941894312", + "3400941893131", + "3400941892998", + "3400941892820", + "3400949338986", + "3400949338818", + "3400941658730", + "3400941657498", + "3400949943456", + "3400949021598", + "3400949010073", + "3400921636284", + "3400949943395", + "3400930000755", + "3400927873690", + "3400949026562", + "3400949026333", + "3400949015047", + "3400949014965", + "3400949014675", + "3400921627701", + "3400921627589", + "3400949011605", + "3400949948079", + "3400921849035", + "3400921609660", + "3400927368110", + "3400934455827", + "3400935583987", + "3400927930805", + "3400939257389", + "3400939307220", + "3400938174342", + "3400921611502", + "3400949948130", + "3400926802462", + "3400936281189", + "3400926814410", + "3400935584069", + "3400936480551", + "3400949948369", + "3400949339709", + "3400939146348", + "3400949022021", + "3400939154503", + "3400939151250", + "3400936324626", + "3400922037301", + "3400921790559", + "3400939655482", + "3400936325227", + "3400927884467", + "3400949014736", + "3400949016518", + "3400949947997", + "3400938105155", + "3400938140224", + "3400922471761", + "3400938111828", + "3400936325807", + "3400936325395", + "3400936325166", + "3400937974769", + "3400938268416", + "3400936480612", + "3400936583610", + "3400936677043", + "3400936674080", + "3400936673540", + "3400938140453", + "3400936484283", + "3400936673199", + "3400936673021", + "3400936672710", + "3400936388611", + "3400936388550", + 
"3400936480490", + "3400936672659", + "3400936668577", + "3400938174922", + "3400938174052", + "3400936668409", + "3400936668348", + "3400939080888", + "3400939389592", + "3400933839932", + "3400939657325", + "3400939151779", + "3400939657264", + "3400939657035", + "3400939154732", + "3400939154442", + "3400939151601", + "3400936356153", + "3400939151489", + "3400939321707", + "3400939145228", + "3400939307398", + "3400938119282", + "3400938119114", + "3400939146287", + "3400939080130", + "3400938119053", + "3400939154213", + "3400939154091", + "3400934095924", + "3400935533661", + "3400935533142", + "3400934303913", + "3400935533081", + "3400936280939", + "3400936290983", + "3400936290006", + "3400936034372", + "3400927933417", + "3400922390826", + "3400921623796", + "3400921793222", + "3400921901405", + "3400936323674", + "3400921640137", + "3400921639995", + "3400921638707", + "3400921633443", + "3400921633382", + "3400921604818", + "3400926802004", + "3400926800802", + "3400926813529", + "3400926813178", + "3400921626698", + "3400921626469", + "3400926664992", + "3400926662813", + "3400927931406", + "3400927884238", + "3400927884009", + "3400921848953", + "3400921780093", + "3400922471532", + "3400921638646", + "3400921637694", + "3400949948420", + "3400927930515", + "3400927930393", + "3400927871689", + "3400921851274", + "3400921779493", + "3400921779325", + "3400921623567", + "3400921611380", + "3400921608199", + "3400936583498", + "3400936672888", + "3400922296234", + "3400927800832", + "3400921607888", + "3400921622447", + "3400921606010", + "3400922392196", + "3400922393438", + "3400922393087", + "3400921632552", + "3400922000510", + "3400922000398", + "3400921906837", + "3400936324565", + "3400921632323", + "3400921630251", + "3400921629941", + "3400927932236", + "3400927920639", + "3400930002896", + "3400930000724", + "3400930003039", + "3400922470702", + "3400930015384", + "3400934475528", + "3400935533722", + "3400938268584", + "3400938103953", + 
"3400936673311", + "3400941610424", + "3400941788413", + "3400949117963", + "3400936483743", + "3400936673601", + "3400938785654", + "3400921928020", + "3400922036359", + "3400936291645", + "3400949004621", + "3400935514561", + "3400930027288", + "3400949946587", + "3400949116324", + "3400941888458", + "3400936887299", + "3400936805385", + "3400949337347", + "3400941785689", + "3400938105445", + "3400936805736", + "3400941609824", + "3400921623338", + "3400949336517", + "3400939524450", + "3400936290815", + "3400936287853", + "3400938108576", + "3400922033747", + "3400949763993", + "3400949721191", + "3400939255729", + "3400939239521", + "3400936290464", + "3400926951535", + "3400949700202", + "3400949755141", + "3400938673081", + "3400939072845", + "3400949007691", + "3400936804845", + "3400938954340", + "3400936862685", + "3400936281530", + "3400936286214", + "3400941906299", + "3400949118793", + "3400926822453", + "3400936674370", + "3400937991568", + "3400934707230", + "3400934096754", + "3400934606182", + "3400938118223", + "3400937531191", + "3400934052460", + "3400939320694", + "3400939239811", + "3400939656892", + "3400921928310", + "3400926765750", + "3400949720712", + "3400949719013", + "3400936323155", + "3400939815824", + "3400938139853", + "3400938109757", + "3400941888229", + "3400927824494", + "3400949001323", + "3400949001286", + "3400936674202", + "3400936357624", + "3400936357334", + "3400936357273", + "3400939074276", + "3400930066423", + "3400941859533", + "3400926814700", + "3400937827539", + "3400949498994", + "3400949721832", + "3400938782004", + "3400922037769", + "3400935533890", + "3400935584878", + "3400938784305", + "3400922491752", + "3400927846847", + "3400922034348", + "3400930070567", + "3400936871380", + "3400927932526", + "3400922197487", + "3400949946938", + "3400927931864", + "3400927870910", + "3400934410642", + "3400949940264", + "3400938117912", + "3400938186758", + "3400949001644", + "3400921854466", + "3400927870859", + 
"3400949768967", + "3400949495863", + "3400938993899", + "3400937828130", + "3400922035987", + "3400934096006", + "3400927817861", + "3400927821592", + "3400926955847", + "3400930060728", + "3400936886810", + "3400927352959", + "3400926954376", + "3400949717750", + "3400949763764", + "3400938117042", + "3400927607554", + "3400934303333", + "3400921995664", + "3400921994025", + "3400938113549", + "3400934475818", + "3400934098017", + "3400927568251", + "3400922197777", + "3400934963261", + "3400922146850", + "3400926817190", + "3400927366338", + "3400927481680", + "3400927418419", + "3400938782172", + "3400938105094", + "3400949758333", + "3400949947010", + "3400938119862", + "3400937827300", + "3400921837896", + "3400921836547", + "3400930060216", + "3400922465678", + "3400926666026", + "3400927801082", + "3400930027301", + "3400935584120", + "3400930060711", + "3400934548420", + "3400921605877", + "3400949009763", + "3400938991420", + "3400922470870", + "3400949026104", + "3400949021949", + "3400939152080", + "3400941906060", + "3400949022199", + "3400939655543", + "3400934417566", + "3400941896842", + "3400938113488", + "3400949341139", + "3400941895203", + "3400937974820", + "3400935973467", + "3400934601910", + "3400921628760", + "3400927932984", + "3400939307459", + "3400921604757", + "3400941904578", + "3400938174113", + "3400936677104", + "3400936673489", + "3400936325746", + "3400936484344", + "3400936668638", + "3400941894190", + "3400936668058", + "3400936290693", + "3400936290174", + "3400939146577", + "3400936803954", + "3400949943517", + "3400939144917", + "3400939079820", + "3400938194043", + "3400935532831", + "3400936388499", + "3400927933585", + "3400934417337", + "3400936674141", + "3400922389585", + "3400927368578", + "3400927800542", + "3400927801143", + "3400921792621", + "3400936356443", + "3400936356214", + "3400936323964", + "3400936323735", + "3400936583559", + "3400921779905", + "3400927932816", + "3400921622218", + "3400921638936", + 
"3400921605068", + "3400926801113", + "3400934096983", + "3400927931574", + "3400927883927", + "3400921637816", + "3400927930225", + "3400922393148", + "3400921630022", + "3400922392028", + "3400922391137", + "3400921627411" + ) + val ipp = new PharmacologicalClassConfig( + name = "IPP", + ATCCodes = List("A02BC*") + ) + override val pharmacologicalClasses: List[PharmacologicalClassConfig] = List(ipp) + +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/Cip13Level.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/Cip13Level.scala similarity index 72% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/Cip13Level.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/Cip13Level.scala index f9587f11..ae7177af 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/Cip13Level.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/Cip13Level.scala @@ -1,9 +1,9 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.apache.spark.sql.Row -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig object Cip13Level extends DrugClassificationLevel { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/DrugClassificationLevel.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/DrugClassificationLevel.scala similarity index 78% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/DrugClassificationLevel.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/DrugClassificationLevel.scala index 101c872d..f4cd3476 100644 --- 
a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/DrugClassificationLevel.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/DrugClassificationLevel.scala @@ -1,9 +1,9 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.apache.spark.sql.Row -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig trait DrugClassificationLevel extends Serializable { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/MoleculeCombinationLevel.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/MoleculeCombinationLevel.scala similarity index 72% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/MoleculeCombinationLevel.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/MoleculeCombinationLevel.scala index b44d7164..e4bf42d7 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/MoleculeCombinationLevel.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/MoleculeCombinationLevel.scala @@ -1,9 +1,9 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.apache.spark.sql.Row -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig object MoleculeCombinationLevel extends DrugClassificationLevel { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/PharmacologicalLevel.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/PharmacologicalLevel.scala similarity index 81% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/PharmacologicalLevel.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/PharmacologicalLevel.scala index 0160af29..a5174762 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/PharmacologicalLevel.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/PharmacologicalLevel.scala @@ -1,9 +1,9 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.apache.spark.sql.Row -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig object PharmacologicalLevel extends DrugClassificationLevel { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/TherapeuticLevel.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/TherapeuticLevel.scala similarity index 78% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/TherapeuticLevel.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/TherapeuticLevel.scala index 6465ca32..34f71867 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/TherapeuticLevel.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/TherapeuticLevel.scala @@ -1,9 +1,9 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.apache.spark.sql.Row -import 
fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig object TherapeuticLevel extends DrugClassificationLevel { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/HadHospitalStaysExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/HadHospitalStaysExtractor.scala new file mode 100644 index 00000000..5af686c0 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/HadHospitalStaysExtractor.scala @@ -0,0 +1,22 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays + +import java.sql.{Date, Timestamp} +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HadHospitalStay, HospitalStay} +import fr.polytechnique.cmap.cnam.etl.extractors.AlwaysTrueStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.had.HadSimpleExtractor + +object HadHospitalStaysExtractor extends HadSimpleExtractor[HospitalStay] + with AlwaysTrueStrategy[HospitalStay] { + override val columnName: String = ColNames.EndDate + override val eventBuilder: EventBuilder = HadHospitalStay + + override def extractValue(row: Row): String = extractGroupId(row) + + override def extractEnd(r: Row): Option[Timestamp] = Some(new Timestamp(r.getAs[Date](ColNames.EndDate).getTime)) + + override def getCodes: SimpleExtractorCodes = SimpleExtractorCodes.empty +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoHospitalStaysExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoHospitalStaysExtractor.scala new file mode 100644 index 00000000..8870959e --- /dev/null +++ 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoHospitalStaysExtractor.scala @@ -0,0 +1,51 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays + +import java.sql.{Date, Timestamp} +import scala.util.Try +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HospitalStay, McoHospitalStay} +import fr.polytechnique.cmap.cnam.etl.extractors.AlwaysTrueStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mco.McoSimpleExtractor + +object McoHospitalStaysExtractor extends McoSimpleExtractor[HospitalStay] with AlwaysTrueStrategy[HospitalStay] { + + override def getCodes: SimpleExtractorCodes = SimpleExtractorCodes.empty + + override def columnName: String = ColNames.EndDate + + override def eventBuilder: EventBuilder = McoHospitalStay + + override def neededColumns: List[String] = List(ColNames.StayFrom, ColNames.StayFromType) ++ super.usedColumns + + override def extractEnd(r: Row): Option[Timestamp] = Some(new Timestamp(r.getAs[Date](ColNames.EndDate).getTime)) + + override def extractStart(r: Row): Timestamp = new Timestamp(r.getAs[Date](ColNames.StartDate).getTime) + + override def extractValue(row: Row): String = extractGroupId(row) + + // The weight encodes the stay's entry mode and origin as mode + 0.1 * origin + // (an empty origin counts as 0, "R" as 8); it falls back to -1 when either field cannot be parsed. + override def extractWeight(r: Row): Double = { + getFromValue(r).flatMap(from => getFromType(r).map(fromType => from + fromType * 0.1)) recover { case _ => -1D } + }.get + + private def getFromValue(r: Row): Try[Double] = { + Try { + r.getAs[String](ColNames.StayFrom).toDouble + } + } + + private def getFromType(r: Row): Try[Double] = { + + val isNull = (s: String) => s == null || s.trim.isEmpty + + Try { + r.getAs[String](ColNames.StayFromType) match { + case value if isNull(value) => 0D + case "R" => 8D + case value => value.toDouble + } + } + } +} diff --git 
a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoceEmergenciesExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoceEmergenciesExtractor.scala new file mode 100644 index 00000000..16d450cb --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoceEmergenciesExtractor.scala @@ -0,0 +1,30 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays + +import java.sql.{Date, Timestamp} +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HospitalStay, McoceEmergency} +import fr.polytechnique.cmap.cnam.etl.extractors.AlwaysTrueStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce.McoCeSimpleExtractor + +object McoceEmergenciesExtractor extends McoCeSimpleExtractor[HospitalStay] with AlwaysTrueStrategy[HospitalStay] { + + // Extractor trait: keep only rows whose act code starts with "ATU", the emergency-visit lump-sum code + override def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(ColNames.ActCode)) && row + .getAs[String](ColNames.ActCode).startsWith("ATU") + + override def getCodes: SimpleExtractorCodes = SimpleExtractorCodes.empty + + // EventRowExtractor trait + override def extractEnd(r: Row): Option[Timestamp] = Some(new Timestamp(r.getAs[Date](ColNames.EndDate).getTime)) + + override def extractValue(row: Row): String = extractGroupId(row) + + override def usedColumns: List[String] = List(ColNames.EndDate) ++ super.usedColumns + + // SimpleExtractor trait + override def columnName: String = ColNames.ActCode + + override def eventBuilder: EventBuilder = McoceEmergency + +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/SsrHospitalStaysExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/SsrHospitalStaysExtractor.scala similarity index 51% rename from 
src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/SsrHospitalStaysExtractor.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/SsrHospitalStaysExtractor.scala index e8b383f0..19ac2c6b 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/SsrHospitalStaysExtractor.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/SsrHospitalStaysExtractor.scala @@ -1,14 +1,15 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays +// License: BSD 3 clause -import java.sql.{Date, Timestamp} +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays +import java.sql.{Date, Timestamp} +import org.apache.spark.sql.Row import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HospitalStay, SsrHospitalStay} -import fr.polytechnique.cmap.cnam.etl.extractors.ssr.SsrExtractor -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{DataFrame, Row} +import fr.polytechnique.cmap.cnam.etl.extractors.AlwaysTrueStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr.SsrSimpleExtractor -object SsrHospitalStaysExtractor extends SsrExtractor[HospitalStay] { +object SsrHospitalStaysExtractor extends SsrSimpleExtractor[HospitalStay] with AlwaysTrueStrategy[HospitalStay] { override val columnName: String = ColNames.EndDate override val eventBuilder: EventBuilder = SsrHospitalStay @@ -16,15 +17,13 @@ object SsrHospitalStaysExtractor extends SsrExtractor[HospitalStay] { override def extractStart(r: Row): Timestamp = new Timestamp(r.getAs[Date](ColNames.StartDate).getTime) - override def isInStudy(codes: Set[String])(row: Row): Boolean = true - - override def code: Row => String = extractGroupId - - override def getInput(sources: Sources): DataFrame = 
sources.ssr.get.select(ColNames.hospitalStayPart.map(col): _*) + override def extractValue(row: Row): String = extractGroupId(row) override def extractGroupId(r: Row): String = { r.getAs[String](ColNames.EtaNum) + "_" + r.getAs[String](ColNames.RhaNum) + "_" + r.getAs[Int](ColNames.Year).toString } + + override def getCodes: SimpleExtractorCodes = SimpleExtractorCodes.empty } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/DcirMoleculePurchases.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/DcirMoleculePurchases.scala similarity index 85% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/DcirMoleculePurchases.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/DcirMoleculePurchases.scala index 35a3e332..4558a078 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/DcirMoleculePurchases.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/DcirMoleculePurchases.scala @@ -1,10 +1,10 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.molecules +package fr.polytechnique.cmap.cnam.etl.extractors.events.molecules import java.sql.Timestamp import org.apache.spark.sql.expressions.Window -import org.apache.spark.sql.functions._ +import org.apache.spark.sql.functions.{col, sum, udf, when} import org.apache.spark.sql.types.{DoubleType, StringType, TimestampType} import org.apache.spark.sql.{Column, DataFrame, Row} import fr.polytechnique.cmap.cnam.etl.events.{Event, Molecule} @@ -12,15 +12,13 @@ import fr.polytechnique.cmap.cnam.etl.extractors.Extractor import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.util.DrugEventsTransformerHelper -class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[Molecule] { +class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[Molecule, 
MoleculePurchasesConfig] { - override def isInStudy(codes: Set[String])(row: Row): Boolean = - codes.contains(row.getAs[String](Columns.Category)) + override def isInStudy(row: Row): Boolean = config.drugClasses.contains(row.getAs[String](Columns.Category)) override def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(Columns.EventDate)) && row.getAs[Int](Columns.NBoxes) > 0 - override def builder(row: Row): Seq[Event[Molecule]] = Seq( Molecule( getPatientID(row), @@ -30,6 +28,14 @@ class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[M ) ) + def getPatientID(row: Row): String = row.getAs[String](Columns.PatientID) + + def getValue(row: Row): String = row.getAs[String](Columns.MoleculeName) + + def getWeight(row: Row): Double = row.getAs[Double](Columns.TotalDose) + + def getEventDate(row: Row): Timestamp = row.getAs[Timestamp](Columns.EventDate) + override def getInput(sources: Sources): DataFrame = { val dcirInputColumns: List[Column] = List( col("NUM_ENQ").cast(StringType).as("patientID"), @@ -52,7 +58,8 @@ class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[M col("TOTAL_MG_PER_UNIT").cast(DoubleType).as("dosage") ) - val groupCols: List[Column] = List(col("patientID"), + val groupCols: List[Column] = List( + col("patientID"), col("moleculeName"), col("eventDate") ) @@ -66,9 +73,10 @@ class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[M val df = sources.dcir.get .select(dcirInputColumns: _*) - .withColumn(Columns.NBoxes, when(col(Columns.NBoxes) < 0, 0) - .when(col(Columns.NBoxes) > config.maxBoxQuantity, 0) - .otherwise(col(Columns.NBoxes)) + .withColumn( + Columns.NBoxes, when(col(Columns.NBoxes) < 0, 0) + .when(col(Columns.NBoxes) > config.maxBoxQuantity, 0) + .otherwise(col(Columns.NBoxes)) ) // get CIP07 drug val joinedByCIP07 = df @@ -88,14 +96,6 @@ class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[M 
.withColumn(Columns.TotalDose, sum(col(Columns.Dosage) * col(Columns.NBoxes)) over win) // Compute total dose } - def getPatientID(row: Row): String = row.getAs[String](Columns.PatientID) - - def getValue(row: Row): String = row.getAs[String](Columns.MoleculeName) - - def getWeight(row: Row): Double = row.getAs[Double](Columns.TotalDose) - - def getEventDate(row: Row): Timestamp = row.getAs[Timestamp](Columns.EventDate) - final object Columns extends Serializable { val PatientID = "patientID" val CIP07 = "CIP07" @@ -108,5 +108,6 @@ class DcirMoleculePurchases(config: MoleculePurchasesConfig) extends Extractor[M val TotalDose = "totalDose" } + override def getCodes: MoleculePurchasesConfig = config } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchases.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchases.scala similarity index 67% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchases.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchases.scala index eb2ca0fe..9d186d4b 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchases.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchases.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.molecules +package fr.polytechnique.cmap.cnam.etl.extractors.events.molecules import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.etl.events._ @@ -9,6 +9,6 @@ import fr.polytechnique.cmap.cnam.etl.sources.Sources class MoleculePurchases(config: MoleculePurchasesConfig) { def extract(sources: Sources): Dataset[Event[Molecule]] = { - new DcirMoleculePurchases(config).extract(sources, config.drugClasses.toSet) + new DcirMoleculePurchases(config).extract(sources) } } diff --git 
a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchasesConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchasesConfig.scala similarity index 73% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchasesConfig.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchasesConfig.scala index 3a065cef..3be421fd 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchasesConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchasesConfig.scala @@ -1,8 +1,9 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.molecules +package fr.polytechnique.cmap.cnam.etl.extractors.events.molecules import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig +import fr.polytechnique.cmap.cnam.etl.extractors.codes.ExtractorCodes /** * Base definition of the config needed by the MoleculePurchases extractor. 
@@ -12,7 +13,9 @@ import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig */ class MoleculePurchasesConfig( val drugClasses: List[String], - val maxBoxQuantity: Int) extends ExtractorConfig with Serializable + val maxBoxQuantity: Int) extends ExtractorConfig with ExtractorCodes { + override def isEmpty: Boolean = drugClasses.isEmpty +} object MoleculePurchasesConfig { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/DcirNgapActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/DcirNgapActExtractor.scala new file mode 100644 index 00000000..cd5dca44 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/DcirNgapActExtractor.scala @@ -0,0 +1,112 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts + + +import org.apache.spark.sql.functions.col +import org.apache.spark.sql.{Column, DataFrame, Row} +import fr.polytechnique.cmap.cnam.etl.events.{DcirNgapAct, Event, EventBuilder, NgapAct} +import fr.polytechnique.cmap.cnam.etl.extractors.Extractor +import fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir.DcirRowExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +final case class DcirNgapActExtractor(ngapActsConfig: NgapActConfig[NgapWithNatClassConfig]) + extends Extractor[NgapAct, NgapActConfig[NgapWithNatClassConfig]] with DcirRowExtractor { + + val columnName: String = ColNames.NaturePrestation + val eventBuilder: EventBuilder = DcirNgapAct + val ngapKeyLetterCol: String = "PRS_NAT_CB2" + + final val PrivateInstitutionCodes = Set(4D, 5D, 6D, 7D) + + override def getInput(sources: Sources): DataFrame = { + + val neededColumns: List[Column] = List( + ColNames.PatientID, ColNames.NaturePrestation, ColNames.NgapCoefficient, + ColNames.DcirEventStart, ColNames.ExecPSNum, ColNames.DcirFluxDate, ngapKeyLetterCol, + ColNames.Sector, ColNames.GHSCode, ColNames.InstitutionCode + ).map(col) + 
+ lazy val irNat = sources.irNat.get + lazy val dcir = sources.dcir.get + + lazy val df: DataFrame = dcir.join(irNat, dcir(ColNames.NaturePrestation).cast("String") === irNat("PRS_NAT")) + df.select(neededColumns: _*) + } + + override def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(ngapKeyLetterCol)) + + override def isInStudy(row: Row): Boolean = { + + lazy val prsNatRef = row.getAs[Int](ColNames.NaturePrestation).toString + lazy val ngapKeyLetter = row.getAs[String](ngapKeyLetterCol) + lazy val ngapCoefficient = row.getAs[Double](ColNames.NgapCoefficient).toString + + ngapActsConfig.actsCategories + .exists( + category => { + category.ngapPrsNatRefs.contains(prsNatRef) || { + category.ngapKeyLetters.contains(ngapKeyLetter) && category.ngapCoefficients.contains(ngapCoefficient) + } + } + ) + } + + def builder(row: Row): Seq[Event[NgapAct]] = { + val patientId = extractPatientId(row) + val groupId = extractGroupId(row) + val value = extractValue(row) + val eventDate = extractStart(row) + val endDate = extractEnd(row) + val weight = extractWeight(row) + + Seq(eventBuilder[NgapAct](patientId, groupId, value, weight, eventDate, endDate)) + } + + /** + * We extract NGAP acts as a concatenation of three codes that identify a specific NGAP act in the SNDS: + * - prestation type (ngapPrsNatRefs: PRS_NAT_REF), + * - prestation key letter (ngapKeyLetters: PRS_NAT_CB2 or ACT_COD in the PMSI_CE), + * - prestation coefficient (ngapCoefficients: PRS_ACT_CFT or ACT_COE in the PMSI_CE) + * + * For more information, see the NgapActConfig documentation. 
+ * + * @return concatenation of the three codes + */ + override def extractValue(row: Row): String = { + s"${row.getAs[Int](ColNames.NaturePrestation)}_${row.getAs[String](ngapKeyLetterCol)}_${ + row.getAs[Double](ColNames.NgapCoefficient).toString + }" + } + + override def extractGroupId(r: Row): String = { + + if (!r.isNullAt(r.fieldIndex(ColNames.Sector)) && getSector(r) == 1) { + DcirNgapAct.groupID.PublicAmbulatory + } + else { + if (r.isNullAt(r.fieldIndex(ColNames.GHSCode))) { + DcirNgapAct.groupID.Liberal + } else { + // Value is not at null, it is not liberal + lazy val ghs = getGHS(r) + lazy val institutionCode = getInstitutionCode(r) + // Check if it is a private ambulatory + if (ghs == 0 && PrivateInstitutionCodes.contains(institutionCode)) { + DcirNgapAct.groupID.PrivateAmbulatory + } + else { + DcirNgapAct.groupID.Unknown + } + } + } + } + + private def getGHS(r: Row): Double = r.getAs[Double](ColNames.GHSCode) + + private def getInstitutionCode(r: Row): Double = r.getAs[Double](ColNames.InstitutionCode) + + private def getSector(r: Row): Double = r.getAs[Double](ColNames.Sector) + + override def getCodes: NgapActConfig[NgapWithNatClassConfig] = ngapActsConfig +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/McoCeNgapActExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/McoCeNgapActExtractor.scala new file mode 100644 index 00000000..268a12a4 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/McoCeNgapActExtractor.scala @@ -0,0 +1,78 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts + +import scala.util.Try +import org.apache.spark.sql.functions.col +import org.apache.spark.sql.{DataFrame, Row} +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.Extractor +import 
fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce.McoCeRowExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +sealed abstract class McoCeNgapActExtractor(ngapActsConfig: NgapActConfig[NgapActClassConfig]) extends Extractor[NgapAct, NgapActConfig[NgapActClassConfig]] + with McoCeRowExtractor { + // abstract values for implementing classes + val keyLetterColumn: String + val coeffColumn: String + val eventBuilder: EventBuilder + + // Implementation of the EventRowExtractor + override def usedColumns: List[String] = super.usedColumns ++ List(keyLetterColumn, coeffColumn) + + override def extractValue(row: Row): String = { + val letter = getNgapLetter(row) + val coeff = getNgapCoeff(row) + s"PmsiCe_${letter}_${coeff}" + } + + // Implementation of the Extractor Trait + override def getCodes: NgapActConfig[NgapActClassConfig] = ngapActsConfig + + override def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(keyLetterColumn)) + + override def isInStudy(row: Row): Boolean = { + lazy val letter = getNgapLetter(row) + lazy val coeff = getNgapCoeff(row) + ngapActsConfig.actsCategories.exists(category => ngapIsInCategory(category, letter, coeff)) + } + + private def ngapIsInCategory(category: NgapActClassConfig, ngapLetter: => String, ngapCoeff: => String): Boolean = + category.ngapKeyLetters.contains(ngapLetter) && { + category.ngapCoefficients.isEmpty || category.ngapCoefficients.contains(ngapCoeff) + } + + private def getNgapLetter(row: Row): String = row.getAs[String](keyLetterColumn) + private def getNgapCoeff(row: Row): String = { + Try(row.getAs[Double](coeffColumn).toString) recover { + case _: NullPointerException => "0" + } + }.get + + + def builder(row: Row): Seq[Event[NgapAct]] = { + val patientId = extractPatientId(row) + val groupId = extractGroupId(row) + val value = extractValue(row) + val eventDate = extractStart(row) + val endDate = extractEnd(row) + val weight = extractWeight(row) + + 
Seq(eventBuilder[NgapAct](patientId, groupId, value, weight, eventDate, endDate)) + } + + override def getInput(sources: Sources): DataFrame = sources.mcoCe.get.select(usedColumns.map(col): _*) +} + +final case class McoCeFbstcNgapActExtractor(ngapConfig: NgapActConfig[NgapActClassConfig]) extends McoCeNgapActExtractor(ngapConfig) { + val keyLetterColumn: String = ColNames.NgapKeyLetterFbstc + val coeffColumn: String = ColNames.NgapCoefficientFbstc + override val eventBuilder: EventBuilder = McoCeFbstcNgapAct +} + +final case class McoCeFcstcNgapActExtractor(ngapConfig: NgapActConfig[NgapActClassConfig]) extends McoCeNgapActExtractor(ngapConfig) { + val keyLetterColumn: String = ColNames.NgapKeyLetterFcstc + val coeffColumn: String = ColNames.NgapCoefficientFcstc + + override val eventBuilder: EventBuilder = McoCeFcstcNgapAct +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/NgapActClassConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/NgapActClassConfig.scala new file mode 100644 index 00000000..bd585b12 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/NgapActClassConfig.scala @@ -0,0 +1,20 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts + +// ngapCoefficients should always be specified with a dot as the decimal separator, as this is how they are coded in the SNDS, +// e.g. "2.0" should be used instead of "2". +class NgapActClassConfig( + val ngapKeyLetters: Seq[String], + val ngapCoefficients: Seq[String]) extends Serializable + +object NgapActClassConfig { + def apply(ngapKeyLetters: Seq[String], ngapCoefficients: Seq[String]): NgapActClassConfig = + new NgapActClassConfig(ngapKeyLetters, ngapCoefficients) +} + +// Use this if your extractor adds information from the IR_NAT_V reference table. 
+class NgapWithNatClassConfig( + override val ngapKeyLetters: Seq[String], + override val ngapCoefficients: Seq[String], + val ngapPrsNatRefs: Seq[String]) extends NgapActClassConfig(ngapKeyLetters, ngapCoefficients) diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/NgapActConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/NgapActConfig.scala new file mode 100644 index 00000000..d93f3fe4 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/NgapActConfig.scala @@ -0,0 +1,36 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts + +import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig +import fr.polytechnique.cmap.cnam.etl.extractors.codes.ExtractorCodes + +/** + * NgapActConfig defines three different ways to filter for specific NGAP acts in the SNDS. + * The base configuration is NgapActClassConfig, which can filter on: + * - prestation type (ngapPrsNatRefs: PRS_NAT_REF), + * - prestation key letter (ngapKeyLetters: PRS_NAT_CB2 or ACT_COD in the PMSI_CE), + * - prestation coefficient (ngapCoefficients: PRS_ACT_CFT or ACT_COE in the PMSI_CE) + * **Note**: If acts_categories is empty, all NGAP acts are extracted. + * NGAP acts can be found in two sources, and the filtering logic differs depending on the source. + * + * In the Dcir, the search is done where the ngapKeyLetter is available (i.e. 
TODO what proportion in echantillon 2008-2016): + * - if a list of ngapPrsNatRefs is given, it extracts all acts with one of these PrsNatRefs; + * - if a list of ngapKeyLetters and a list of ngapCoefficients are given, it extracts all combinations of (keyLetter, coefficient). + * + * In the Pmsi (only McoCe implemented; less than 12000 NGAP acts per year in SSR_CE), + * the search is done where the ngapCoefficient is available: + * - if a list of ngapKeyLetters and a list of ngapCoefficients are given, it extracts all combinations of (keyLetter, coefficient); + * - if the list of ngapCoefficients is empty, it extracts all acts whose key letter matches, whatever the coefficient. + * + * @param actsCategories List of configurations selecting specific NgapActs + */ +class NgapActConfig[+C <: NgapActClassConfig]( + val actsCategories: List[C]) extends ExtractorConfig with ExtractorCodes { + override def isEmpty: Boolean = actsCategories.isEmpty +} + +object NgapActConfig { + + def apply[C <: NgapActClassConfig](actsCategories: List[C]): NgapActConfig[C] = new NgapActConfig[C](actsCategories) +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/McoCeSpecialtyExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/McoCeSpecialtyExtractor.scala new file mode 100644 index 00000000..f445a4bb --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/McoCeSpecialtyExtractor.scala @@ -0,0 +1,37 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.prestations + +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, McoCeFbstcMedicalPractitionerClaim, McoCeFcstcMedicalPractitionerClaim, PractitionerClaimSpeciality} +import fr.polytechnique.cmap.cnam.etl.extractors.IsInStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import 
fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce.McoCeSimpleExtractor + +/** + * Get specialties of the non medical practitioners in the MCO_CE: + * If a specialty is available, it extracts the specialty using MCO_FBSTC_ _EXE_SPE and MCO_FCSTC_ _EXE_SPE. + * These two columns are complementary as described here : + * https://documentation-snds.health-data-hub.fr/fiches/actes_consult_externes.html#les-tables-du-pmsi-version-snds-pour-les-ace + **/ +sealed abstract class McoCeSpecialtyExtractor(codes: SimpleExtractorCodes) extends McoCeSimpleExtractor[PractitionerClaimSpeciality] + with IsInStrategy[PractitionerClaimSpeciality] { + override def extractValue(row: Row): String = row.getAs[Int](columnName).toString + + override def isInExtractorScope(row: Row): Boolean = { + (!row.isNullAt(row.fieldIndex(columnName))) & (row.getAs[Integer](columnName) != 0) + } + + override def getCodes: SimpleExtractorCodes = codes +} + +final case class McoCeFbstcSpecialtyExtractor(codes: SimpleExtractorCodes) extends McoCeSpecialtyExtractor(codes) { + override val columnName: String = ColNames.PractitionnerSpecialtyFbstc + override val eventBuilder: EventBuilder = McoCeFbstcMedicalPractitionerClaim +} + + +final case class McoCeFcstcSpecialtyExtractor(codes: SimpleExtractorCodes) extends McoCeSpecialtyExtractor(codes) { + override val columnName: String = ColNames.PractitionnerSpecialtyFcstc + override val eventBuilder: EventBuilder = McoCeFcstcMedicalPractitionerClaim +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityConfig.scala similarity index 92% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityConfig.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityConfig.scala index 
56f4c461..0eb362f1 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityConfig.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.prestations +package fr.polytechnique.cmap.cnam.etl.extractors.events.prestations import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityExtractor.scala new file mode 100644 index 00000000..c4bd7f6e --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityExtractor.scala @@ -0,0 +1,58 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.prestations + +import java.sql.Timestamp +import scala.util.Try +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.IsInStrategy +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir.DcirSimpleExtractor + +sealed abstract class DcirPractitionerSpecialityExtractor(codes: SimpleExtractorCodes) + extends DcirSimpleExtractor[PractitionerClaimSpeciality] with IsInStrategy[PractitionerClaimSpeciality] { + + override def usedColumns: List[ColName] = ColNames.ExecPSNum :: super.usedColumns + + override def extractStart(r: Row): Timestamp = { + Try(super.extractStart(r)) recover { + case _: NullPointerException => extractFluxDate(r) + } + }.get + + override def extractGroupId(r: Row): String = { + r.getAs[String](ColNames.ExecPSNum) + } + + override def extractValue(row: Row): String = 
row.getAs[Integer](columnName).toString + + override def isInExtractorScope(row: Row): Boolean = { + (!row.isNullAt(row.fieldIndex(columnName))) & (row.getAs[Integer](columnName) != 0) + } + + override def getCodes: SimpleExtractorCodes = codes +} + +/** + * Get specialties of medical practitioners in the Dcir: + * If a specialty is available, it extracts the specialty using PSE_SPE_COD and the practitioner + * identifier from the database. + */ +final case class MedicalPractitionerClaimExtractor(codes: SimpleExtractorCodes) + extends DcirPractitionerSpecialityExtractor(codes) { + override val columnName: String = ColNames.MSpe + override val eventBuilder: EventBuilder = MedicalPractitionerClaim +} + + +/** + * Get specialties of the non medical practitioners in the Dcir: + * If a specialty is available, it extracts the specialty using PSE_ACT_NAT and the practitioner + * identifier from the database. + */ +final case class NonMedicalPractitionerClaimExtractor(codes: SimpleExtractorCodes) + extends DcirPractitionerSpecialityExtractor(codes) { + override val columnName: String = ColNames.NonMSpe + override val eventBuilder: EventBuilder = NonMedicalPractitionerClaim +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/takeoverreasons/HadTakeOverReasonExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/takeoverreasons/HadTakeOverReasonExtractor.scala new file mode 100644 index 00000000..3ffd868d --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/takeoverreasons/HadTakeOverReasonExtractor.scala @@ -0,0 +1,30 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.takeoverreasons + +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HadAssociatedTakeOver, HadMainTakeOver, MedicalTakeOverReason} +import fr.polytechnique.cmap.cnam.etl.extractors.IsInStrategy +import 
fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.had.HadSimpleExtractor + +final case class HadMainTakeOverExtractor(codes: SimpleExtractorCodes) extends HadSimpleExtractor[MedicalTakeOverReason] + with IsInStrategy[MedicalTakeOverReason] { + + override val columnName: String = ColNames.PEC_PAL + override val eventBuilder: EventBuilder = HadMainTakeOver + + override def extractValue(row: Row): String = row.getAs[Int](columnName).toString + + override def getCodes: SimpleExtractorCodes = codes +} + +final case class HadAssociatedTakeOverExtractor(codes: SimpleExtractorCodes) extends HadSimpleExtractor[MedicalTakeOverReason] + with IsInStrategy[MedicalTakeOverReason] { + + override val columnName: String = ColNames.PEC_ASS + override val eventBuilder: EventBuilder = HadAssociatedTakeOver + override def extractValue(row: Row): String = row.getAs[Int](columnName).toString + + override def getCodes: SimpleExtractorCodes = codes +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadExtractor.scala deleted file mode 100644 index cd25b8f2..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadExtractor.scala +++ /dev/null @@ -1,47 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.had - -import java.sql.Timestamp - -import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event, EventBuilder} -import fr.polytechnique.cmap.cnam.etl.extractors.{EventRowExtractor, Extractor} -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{DataFrame, Row} - -trait HadExtractor[EventType <: AnyEvent] extends Extractor[EventType] with HadSource with EventRowExtractor { - - val columnName: String - - val eventBuilder: EventBuilder - - def getInput(sources: Sources): DataFrame = 
sources.had.get.select(ColNames.all.map(col): _*).estimateStayStartTime - - def isInStudy(codes: Set[String]) - (row: Row): Boolean = codes.exists(code(row).startsWith(_)) - - def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(columnName)) - - def builder(row: Row): Seq[Event[EventType]] = { - lazy val patientId = extractPatientId(row) - lazy val groupId = extractGroupId(row) - lazy val eventDate = extractStart(row) - lazy val endDate = extractEnd(row) - lazy val weight = extractWeight(row) - - Seq(eventBuilder[EventType](patientId, groupId, code(row), weight, eventDate, endDate)) - } - - def code: Row => String = (row: Row) => row.getAs[Int](columnName).toString - - def extractPatientId(r: Row): String = { - r.getAs[String](ColNames.PatientID) - } - - override def extractGroupId(r: Row): String = { - r.getAs[String](ColNames.EtaNumEpmsi) + "_" + - r.getAs[String](ColNames.RhadNum) + "_" + - r.getAs[Int](NewColumns.Year).toString - } - - def extractStart(r: Row): Timestamp = r.getAs[Timestamp](NewColumns.EstimatedStayStart) -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/HadHospitalStaysExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/HadHospitalStaysExtractor.scala deleted file mode 100644 index c57f2d85..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/HadHospitalStaysExtractor.scala +++ /dev/null @@ -1,30 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays - -import java.sql.{Date, Timestamp} - -import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HospitalStay, HadHospitalStay} -import fr.polytechnique.cmap.cnam.etl.extractors.had.HadExtractor -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{DataFrame, Row} - -object HadHospitalStaysExtractor extends HadExtractor[HospitalStay] { - override val columnName: String = 
ColNames.EndDate - override val eventBuilder: EventBuilder = HadHospitalStay - - override def extractEnd(r: Row): Option[Timestamp] = Some(new Timestamp(r.getAs[Date](ColNames.EndDate).getTime)) - - override def extractStart(r: Row): Timestamp = new Timestamp(r.getAs[Date](ColNames.StartDate).getTime) - - override def isInStudy(codes: Set[String])(row: Row): Boolean = true - - override def code: Row => String = extractGroupId - - override def getInput(sources: Sources): DataFrame = sources.had.get.select(ColNames.hospitalStayPart.map(col): _*).estimateStayStartTime - - override def extractGroupId(r: Row): String = { - r.getAs[String](ColNames.EtaNumEpmsi) + "_" + - r.getAs[String](ColNames.RhadNum) + "_" + - r.getAs[Int](NewColumns.Year).toString - } -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStaysExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStaysExtractor.scala deleted file mode 100644 index 468f528c..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStaysExtractor.scala +++ /dev/null @@ -1,23 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays - -import java.sql.{Date, Timestamp} -import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{DataFrame, Row} -import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HospitalStay, McoHospitalStay} -import fr.polytechnique.cmap.cnam.etl.extractors.mco.McoExtractor -import fr.polytechnique.cmap.cnam.etl.sources.Sources - -object McoHospitalStaysExtractor extends McoExtractor[HospitalStay] { - override val columnName: String = ColNames.EndDate - override val eventBuilder: EventBuilder = McoHospitalStay - - override def extractEnd(r: Row): Option[Timestamp] = Some(new Timestamp(r.getAs[Date](ColNames.EndDate).getTime)) - - override def extractStart(r: Row): Timestamp = new Timestamp(r.getAs[Date](ColNames.StartDate).getTime) - - 
override def isInStudy(codes: Set[String])(row: Row): Boolean = true - - override def code: Row => String = extractGroupId - - override def getInput(sources: Sources): DataFrame = sources.mco.get.select(ColNames.hospitalStayPart.map(col): _*) -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoExtractor.scala deleted file mode 100644 index e50ce46d..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoExtractor.scala +++ /dev/null @@ -1,50 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.mco - -import java.sql.Timestamp -import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{DataFrame, Row} -import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event, EventBuilder} -import fr.polytechnique.cmap.cnam.etl.extractors.{EventRowExtractor, Extractor} -import fr.polytechnique.cmap.cnam.etl.sources.Sources - -trait McoExtractor[EventType <: AnyEvent] extends Extractor[EventType] with McoSource with EventRowExtractor { - - val columnName: String - - val eventBuilder: EventBuilder - - def getInput(sources: Sources): DataFrame = sources.mco.get.select(ColNames.all.map(col): _*).estimateStayStartTime - - def isInStudy(codes: Set[String]) - (row: Row): Boolean = codes.exists(code(row).startsWith(_)) - - def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(columnName)) - - def builder(row: Row): Seq[Event[EventType]] = { - lazy val patientId = extractPatientId(row) - lazy val groupId = extractGroupId(row) - lazy val eventDate = extractStart(row) - lazy val endDate = extractEnd(row) - lazy val weight = extractWeight(row) - - Seq(eventBuilder[EventType](patientId, groupId, code(row), weight, eventDate, endDate)) - } - - def code = (row: Row) => row.getAs[String](columnName) - - def extractPatientId(r: Row): String = { - r.getAs[String](ColNames.PatientID) - } - - override 
def extractGroupId(r: Row): String = { - r.getAs[String](ColNames.EtaNum) + "_" + - r.getAs[String](ColNames.RsaNum) + "_" + - r.getAs[Int](ColNames.Year).toString - } - - def extractStart(r: Row): Timestamp = r.getAs[Timestamp](NewColumns.EstimatedStayStart) - - def getExit(r: Row): String = r.getAs[String](ColNames.ExitMode) -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/AllPatientExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/AllPatientExtractor.scala new file mode 100644 index 00000000..3c39056d --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/AllPatientExtractor.scala @@ -0,0 +1,89 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.patients + +import org.apache.spark.sql.functions.{coalesce, col, when, year} +import org.apache.spark.sql.{Column, DataFrame, Dataset} +import fr.polytechnique.cmap.cnam.etl.patients.Patient +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +object AllPatientExtractor { + + def extract(sources: Sources): Dataset[Patient] = { + + val irBenPatients: Dataset[Patient] = IrBenPatients.extract(sources).as("irBen") + val dcirPatients: Dataset[Patient] = DcirPatients.extract(sources).as("dcir") + val mcoPatients: Dataset[Patient] = McoPatients.extract(sources).as("mco") + + val joinColumn: Column = coalesce(col("irBen.patientID"), col("mco.patientID")) + + val patients: DataFrame = irBenPatients + .join(mcoPatients, col("irBen.patientID") === col("mco.patientID"), "outer") + .join(dcirPatients, joinColumn === col("dcir.patientID"), "outer") + + val patientID: Column = coalesce( + col("dcir.patientID"), + col("irBen.patientID"), + col("mco.patientID") + ) + + val gender: Column = coalesce( + col("irBen.gender"), + col("dcir.gender") + ) + + val birthDate: Column = coalesce( + col("irBen.birthDate"), + col("dcir.birthDate") + ) + + val deathDate: Column = coalesce( + when( + validateDeathDate(col("irBen.deathDate"), 
birthDate), + col("irBen.deathDate") + ), + when( + validateDeathDate(col("dcir.deathDate"), birthDate), + col("dcir.deathDate") + ), + when( + validateDeathDate(col("mco.deathDate"), birthDate), + col("mco.deathDate") + )) + + import patients.sparkSession.implicits._ + + val birthYearErrors = List(-1, 0, 1, 1600) + + val filteredPatients = patients.where(birthDate.isNotNull && !year(birthDate).isin(birthYearErrors: _*)).select( + patientID.as("patientID"), + gender.as("gender"), + birthDate.as("birthDate"), + deathDate.as("deathDate") + ).as[Patient] + + sources.mcoCe match { + case None => filteredPatients.as[Patient] + case Some(_) => + val mcocePatients: Dataset[Patient] = McocePatients.extract(sources).as("mco_ce") + + val allPatients = filteredPatients.as("patients") + .join(mcocePatients, col("patients.patientID") === col("mco_ce.patientID"), "full") + + val idCol = coalesce(col("patients.patientID"), col("mco_ce.patientID")) + .alias("patientID") + val genderCol = coalesce(col("patients.gender"), col("mco_ce.gender")) + .alias("gender") + val birthDateCol = coalesce(col("patients.birthDate"), col("mco_ce.birthDate")) + .alias("birthDate") + val deathDateCol = coalesce(col("patients.deathDate"), col("mco_ce.deathDate")) + .alias("deathDate") + + allPatients + .select(idCol, genderCol, birthDateCol, deathDateCol) + .filter(col("patientID").isNotNull && col("gender").isNotNull && col("birthDate").isNotNull) + .as[Patient] + } + } + + def validateDeathDate(deathDate: Column, birthDate: Column): Column = + deathDate >= birthDate +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatients.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatients.scala index 629ce0fc..b6502eac 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatients.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatients.scala @@ -2,104 +2,176 @@ package 
fr.polytechnique.cmap.cnam.etl.extractors.patients +import java.sql.Timestamp import org.apache.spark.sql.expressions.Window import org.apache.spark.sql.functions._ import org.apache.spark.sql.types._ -import org.apache.spark.sql.{Column, DataFrame, Dataset} +import org.apache.spark.sql.{Column, Dataset} import fr.polytechnique.cmap.cnam.etl.extractors.patients.PatientUtils._ -import fr.polytechnique.cmap.cnam.etl.patients.Patient +import fr.polytechnique.cmap.cnam.etl.sources.Sources -private[patients] object DcirPatients { +case class PatientDcir(patientID: String, gender: Int, age: Int, birthYear: String, birthDate: Timestamp, eventDate: Timestamp, deathDate: Option[Timestamp]) + extends DerivedPatient - implicit class DcirPatientsDataFrame(data: DataFrame) { +private[patients] object DcirPatients extends PatientExtractor[PatientDcir] { - // The birth year for each patient is found by grouping by patientId and birthYear and then - // by taking the most frequent birth year for each patient. - def findBirthYears: DataFrame = { - val window = Window.partitionBy(col("patientID")).orderBy(col("count").desc, col("birthYear")) - data - .groupBy(col("patientID"), col("birthYear")).agg(count("*").as("count")) - // "first" is only deterministic when applied over an ordered window: - .select(col("patientID"), first(col("birthYear")).over(window).as("birthYear")) - .distinct - } + /** Find birth date of patients. + * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with birth date. 
+ */ + override def findPatientBirthDate(patients: Dataset[PatientDcir]): Dataset[PatientDcir] = { + + val window = Window.partitionBy(col("patientID")).orderBy(col("count").desc, col("birthYear")) + val birthYear = patients + .groupBy(col("patientID"), col("birthYear")).agg(count("*").as("count")) + // "first" is only deterministic when applied over an ordered window: + .select(col("patientID"), first(col("birthYear")).over(window).as("birthYear")) + .distinct // After selecting the data, the next step is to group by patientId and age, because we need to // estimate the birthDate and we use min(eventDate) and max(eventDate) for each age to achieve // that. - def groupByIdAndAge: DataFrame = { - data - .groupBy(col("patientID"), col("age")) - .agg( - count("gender").as("genderCount"), // We will use it to find the appropriate gender (avg) - sum("gender").as("genderSum"), // We will use it to find the appropriate gender (avg) - min("eventDate").as("minEventDate"), // the min event date for each age of a patient - max("eventDate").as("maxEventDate"), // the max event date for each age of a patient - min("deathDate").as("deathDate") // the earliest death date - ) - } + val minmaxevent = patients + .groupBy(col("patientID"), col("age")) + .agg( + min("eventDate").as("minEventDate"), // the min event date for each age of a patient + max("eventDate").as("maxEventDate") // the max event date for each age of a patient + ) // Then we aggregate again by taking the mean between the closest dates where the age changed. 
// For example, if the patient was 60yo when an event happened on Apr/2010 and he was 61yo when // another event happened on Jun/2010, we calculate the mean and estimate his birthday as - // being in May of the year found in "findBirthYears" - def estimateFields: DataFrame = { - val birthDateAggCol: Column = estimateBirthDateCol( - max(col("minEventDate")).cast(TimestampType), - min(col("maxEventDate")).cast(TimestampType), - first(col("birthYear")) - ) + // being in May of the year found + val birthDateAggCol: Column = estimateBirthDateCol( + max(col("minEventDate")).cast(TimestampType), + min(col("maxEventDate")).cast(TimestampType), + first(col("birthYear")) + ) + + val birthDate = minmaxevent.join(birthYear, "patientID") + .groupBy(col("patientID")) + .agg( + birthDateAggCol.as("birthDate")) + + import patients.sparkSession.implicits._ - data - .groupBy(col("patientID")) - .agg( - // Here we calculate the average of gender values and then we round. So, if 1 is more - // common, the average will be less than 1.5 and the final value will be 1. The same is - // valid for the case where 2 is more common. This is the reason why we set invalid - // values for gender to null. - round(sum(col("genderSum")) / sum(col("genderCount"))).cast(IntegerType).as("gender"), - birthDateAggCol.as("birthDate"), - min(col("deathDate")).cast(TimestampType).as("deathDate") - ) - } + val result = patients.as("patients") + .joinWith(birthYear.as("birthYearDf"), col("patients.patientID").equalTo(col("birthYearDf.patientID")), "left") + + result + .joinWith(birthDate, result("_1.patientID") === birthDate("patientID"), "left") + .map(p => + PatientDcir( + p._1._1.patientID, + p._1._1.gender, + p._1._1.age, + p._1._2.getAs("birthYear"), + p._2.getAs("birthDate"), + p._1._1.eventDate, + p._1._1.deathDate + )) } - def extract( - dcir: DataFrame, - minGender: Int, - maxGender: Int, - minYear: Int, - maxYear: Int): Dataset[Patient] = { + /** Find gender of patients. 
+ * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with gender. + */ + override def findPatientGender(patients: Dataset[PatientDcir]): Dataset[PatientDcir] = { + + import patients.sparkSession.implicits._ + + val genderCodeError = 9 + + val gendercount = patients + .filter(_.gender != genderCodeError) + .groupByKey(p => (p.patientID, p.age)) + .count() + .map(p => (p._1._1, p._2.toInt)) + + val gendersum = patients + .filter(_.gender != genderCodeError) + .map(p => ((p.patientID, p.age), p.gender)) + .groupByKey(_._1) + .mapValues(row => row._2) + .reduceGroups((acc, str) => acc + str) + .map(p => (p._1, p._2)) - val genderCol: Column = when( - col("BEN_SEX_COD").between(minGender, maxGender), - col("BEN_SEX_COD") - ).cast(IntegerType) + val sumgendercount = gendercount + .groupByKey(p => p._1) + .mapValues(row => row._2) + .reduceGroups((acc, str) => acc + str) + .map(p => (p._1, p._2)) - val deathDateCol: Column = when( - year(col("BEN_DCD_DTE")).between(minYear, maxYear), - col("BEN_DCD_DTE") - ).cast(DateType) + val sumgendersum = gendersum + .groupByKey(p => p._1._1) + .mapValues(row => row._2) + .reduceGroups((acc, str) => acc + str) + .map(p => (p._1, p._2)) + val result = patients.joinWith(sumgendersum, patients("patientID") === sumgendersum("_1"), "left") + + result.joinWith(sumgendercount, result("_1.patientID") === sumgendercount("_1"), "left") + .map(p => + PatientDcir( + p._1._1.patientID, + if(p._2 != null) Math.round(p._1._2._2.toFloat / p._2._2) else 0, + p._1._1.age, + p._1._1.birthYear, + p._1._1.birthDate, + p._1._1.eventDate, + p._1._1.deathDate + )) + } + + /** Find death date of patients. + * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with death date. 
+ */ + override def findPatientDeathDate(patients: Dataset[PatientDcir]): Dataset[PatientDcir] = { + import patients.sparkSession.implicits._ + val mindeathdate = patients + .groupByKey(p => p.patientID) + .reduceGroups((p1, p2) => if ((p2.deathDate.isEmpty && p1.deathDate.isEmpty) || p2.deathDate.isEmpty || (p1.deathDate.isDefined && p1.deathDate.get.before(p2.deathDate.get))) p1 else p2) + .map(p => (p._2.patientID, p._2.deathDate)) + + val genderCodeError = 9 + + patients + .filter(_.gender != genderCodeError) + .joinWith(mindeathdate, patients("patientID") === mindeathdate("_1"), "left") + .map(p => + PatientDcir( + p._1.patientID, + p._1.gender, + p._1.age, + p._1.birthYear, + p._1.birthDate, + p._1.eventDate, + p._2._2)) + } + + /** Gets and prepares all the needed columns from the Sources. + * + * @param sources Source object [[Sources]] that contains all sources. + * @return A [[Dataset]] with needed columns. + */ + override def getInput(sources: Sources): Dataset[PatientDcir] = { val inputColumns: List[Column] = List( col("NUM_ENQ").cast(StringType).as("patientID"), - genderCol.as("gender"), + col("BEN_SEX_COD").cast(IntegerType).as("gender"), col("BEN_AMA_COD").cast(IntegerType).as("age"), col("BEN_NAI_ANN").cast(StringType).as("birthYear"), - col("EXE_SOI_DTD").cast(DateType).as("eventDate"), - deathDateCol.as("deathDate") + lit(null).cast(TimestampType).as("birthDate"), + col("EXE_SOI_DTD").cast(TimestampType).as("eventDate"), + col("BEN_DCD_DTE").cast(TimestampType).as("deathDate") ) - val persistedDcir = dcir.select(inputColumns: _*) - - val birthYears: DataFrame = persistedDcir.findBirthYears - + val dcir = sources.dcir.get import dcir.sqlContext.implicits._ - val result = persistedDcir - .groupByIdAndAge - .join(birthYears, "patientID") - .estimateFields - .as[Patient] - result + dcir.select(inputColumns: _*).as[PatientDcir] } + } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatients.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatients.scala index adf88d75..a25dade9 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatients.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatients.scala @@ -1,58 +1,77 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients -import fr.polytechnique.cmap.cnam.util.functions.computeDateUsingMonthYear +import java.sql.Timestamp import org.apache.spark.sql.functions._ -import org.apache.spark.sql.{Column, DataFrame} - -private[patients] object HadPatients { - - val inputColumns: List[Column] = List( - col("NUM_ENQ").as("patientID"), - col("HAD_B__SOR_MOD").as("SOR_MOD"), - col("HAD_B__SOR_MOI").as("SOR_MOI"), - col("HAD_B__SOR_ANN").as("SOR_ANN") - ) - - val outputColumns: List[Column] = List( - col("patientID"), - col("deathDate") - ) - - implicit class HadPatientsDataFrame(data: DataFrame) { - - def getDeathDates(deathCode: Int): DataFrame = { - val deathDates: DataFrame = data.filter(col("SOR_MOD") === deathCode) - .withColumn("deathDate", computeDateUsingMonthYear(col("SOR_MOI"), col("SOR_ANN"))) - - val result = deathDates - .groupBy("patientID") - .agg( - countDistinct(col("deathDate")).as("count"), - min(col("deathDate")).as("deathDate") - ).cache() - /* - val conflicts = result - .filter(col("count") > 1) - .select(col("patientID")) - .distinct - .collect - - if(conflicts.length != 0) - Logger.getLogger(getClass).warn("The patients in " + - conflicts.deep.mkString("\n") + - "\nhave conflicting DEATH DATES in HAD." 
+ - "\nTaking Minimum Death Dates") - */ - result - } +import org.apache.spark.sql.types.{IntegerType, StringType, TimestampType} +import org.apache.spark.sql.{Column, Dataset} +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.computeDateUsingMonthYear + +case class PatientHad(patientID: String, exitMode: Int, exitMonth: String, exitYear: String, gender: Int, birthDate: Timestamp, deathDate: Option[Timestamp]) + extends DerivedPatient + +private[patients] object HadPatients extends PatientExtractor[PatientHad] { + + /** Find birth date of patients. + * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with birth date. + */ + override def findPatientBirthDate(patients: Dataset[PatientHad]): Dataset[PatientHad] = { + patients + } + + /** Find gender of patients. + * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with gender. + */ + override def findPatientGender(patients: Dataset[PatientHad]): Dataset[PatientHad] = { + patients + } + + /** Find death date of patients. + * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with death date. + */ + override def findPatientDeathDate(patients: Dataset[PatientHad]): Dataset[PatientHad] = { + import patients.sparkSession.implicits._ + val deathCode = 9 + patients + .filter(_.exitMode == deathCode) + .groupByKey(p => p.patientID) + .reduceGroups((p1, p2) => if ((p2.deathDate.isEmpty && p1.deathDate.isEmpty) || p2.deathDate.isEmpty || (p1.deathDate.isDefined && p1.deathDate.get.before(p2.deathDate.get))) p1 else p2) + .map(p => + PatientHad( + p._2.patientID, + p._2.exitMode, + p._2.exitMonth, + p._2.exitYear, + p._2.gender, + p._2.birthDate, + p._2.deathDate + )) } - def extract(had: DataFrame, hadDeathCode: Int = 9): DataFrame = { + /** Gets and prepares all the needed columns from the Sources. 
+ * + * @param sources Source object [[Sources]] that contains all sources. + * @return A [[Dataset]] with needed columns. + */ + override def getInput(sources: Sources): Dataset[PatientHad] = { + val inputColumns: List[Column] = List( + col("NUM_ENQ").cast(StringType).as("patientID"), + col("HAD_B__SOR_MOD").cast(IntegerType).as("exitMode"), + col("HAD_B__SOR_MOI").cast(StringType).as("exitMonth"), + col("HAD_B__SOR_ANN").cast(StringType).as("exitYear"), + lit(0).cast(IntegerType).as("gender"), + lit(null).cast(TimestampType).as("birthDate"), + computeDateUsingMonthYear(col("HAD_B__SOR_MOI"), col("HAD_B__SOR_ANN")).cast(TimestampType).as("deathDate") + ) - had - .select(inputColumns: _*) - .distinct - .getDeathDates(hadDeathCode) - .select(outputColumns: _*) + val had = sources.had.get + import had.sqlContext.implicits._ + had.select(inputColumns: _*).as[PatientHad] } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatients.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatients.scala index b30c183f..c9ee1907 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatients.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatients.scala @@ -2,89 +2,77 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients +import java.sql.Timestamp import org.apache.spark.sql.functions._ -import org.apache.spark.sql.types.TimestampType -import org.apache.spark.sql.{Column, DataFrame, Dataset} -import fr.polytechnique.cmap.cnam.etl.patients.Patient +import org.apache.spark.sql.types.{IntegerType, StringType, TimestampType} +import org.apache.spark.sql.{Column, Dataset} +import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.util.functions.computeDateUsingMonthYear -private[patients] object IrBenPatients { +case class PatientIrBen(patientID: String, gender: Int, birthMonth: String, birthYear: String, birthDate: 
Timestamp, deathDate: Option[Timestamp]) + extends DerivedPatient - val inputColumns = List( - col("NUM_ENQ").as("patientID"), - col("BEN_SEX_COD"), - col("BEN_NAI_MOI"), - col("BEN_NAI_ANN"), - col("BEN_DCD_DTE") - ) +private[patients] object IrBenPatients extends PatientExtractor[PatientIrBen] { - val outputColumns = List( - col("patientID"), - col("gender"), - col("birthDate"), - col("deathDate") - ) - - implicit class IrBenPatientsDataFrame(data: DataFrame) { - - def getGender: DataFrame = { - val result = data - .select( - col("patientID"), - col("BEN_SEX_COD").cast("int").as("gender") - ).distinct - .cache - - val patients = result.select(col("patientID")).distinct() - - if (result.count != patients.count) { - throw new Exception("One or more patients have conflicting SEX CODE in IR_BEN_R") - } - - result - } - - def getDeathDate: DataFrame = { - data.filter(col("BEN_DCD_DTE").isNotNull) - .groupBy(col("patientID")) - .agg(min(col("BEN_DCD_DTE")).cast(TimestampType).as("deathDate")) - } - - def getBirthDate(minYear: Int = 1900, maxYear: Int = 2100): DataFrame = { - - val birthDate: Column = computeDateUsingMonthYear(col("BEN_NAI_MOI"), col("BEN_NAI_ANN")).as("birthDate") - - val result = data - .filter( - col("BEN_NAI_MOI").between(1, 12) && - col("BEN_NAI_ANN").between(minYear, maxYear) - ) - .select(col("patientID"), birthDate) - .distinct - .cache - val patients = result.select(col("patientID")).distinct - - // This check makes sure patients don't have conflicting birth dates. - if (result.count != patients.count) { - throw new Exception("One or more patients have conflicting BIRTH DATES in IR_BEN_R") - } - - result - } + /** Find birth date of patients. + * + * @param patients that contains all patients. + * @return A [[Dataset]] of patients with birth date. 
+   */
+  override def findPatientBirthDate(patients: Dataset[PatientIrBen]): Dataset[PatientIrBen] = {
+    patients
+  }
 
-  def extract(irBen: DataFrame, minYear: Int, maxYear: Int): Dataset[Patient] = {
-
-    val persistedIrBen = irBen.select(inputColumns: _*).persist()
-    import persistedIrBen.sqlContext.implicits._
+  /** Find gender of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with gender.
+   */
+  override def findPatientGender(patients: Dataset[PatientIrBen]): Dataset[PatientIrBen] = {
+    patients
+  }
 
-    val birthDates = persistedIrBen.getBirthDate(minYear, maxYear)
-    val deathDates = persistedIrBen.getDeathDate
+  /** Find death date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with death date.
+   */
+  override def findPatientDeathDate(patients: Dataset[PatientIrBen]): Dataset[PatientIrBen] = {
+    import patients.sparkSession.implicits._
+    val mindeathdate = patients
+      .groupByKey(p => p.patientID)
+      .reduceGroups((p1, p2) => if ((p2.deathDate.isEmpty && p1.deathDate.isEmpty) || p2.deathDate.isEmpty || (p1.deathDate.isDefined && p1.deathDate.get.before(p2.deathDate.get))) p1 else p2)
+      .map(p => (p._2.patientID, p._2.deathDate))
+
+    patients.joinWith(mindeathdate, patients("patientID") === mindeathdate("_1"), "left")
+      .map(p =>
+        PatientIrBen(
+          p._1.patientID,
+          p._1.gender,
+          p._1.birthMonth,
+          p._1.birthYear,
+          p._1.birthDate,
+          p._2._2
+        ))
+  }
 
-    persistedIrBen.getGender
-      .join(deathDates, Seq("patientID"), "left_outer")
-      .join(birthDates, Seq("patientID"), "left_outer")
-      .select(outputColumns: _*)
-      .as[Patient]
+  /** Gets and prepares all the needed columns from the Sources.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @return A [[Dataset]] with needed columns.
+   */
+  override def getInput(sources: Sources): Dataset[PatientIrBen] = {
+    val inputColumns: List[Column] = List(
+      col("NUM_ENQ").cast(StringType).as("patientID"),
+      col("BEN_SEX_COD").cast(IntegerType).as("gender"),
+      col("BEN_NAI_MOI").cast(StringType).as("birthMonth"),
+      col("BEN_NAI_ANN").cast(StringType).as("birthYear"),
+      computeDateUsingMonthYear(col("BEN_NAI_MOI"), col("BEN_NAI_ANN")).as("birthDate"),
+      col("BEN_DCD_DTE").cast(TimestampType).as("deathDate")
+    )
+
+    val irBen = sources.irBen.get
+    import irBen.sqlContext.implicits._
+    irBen.select(inputColumns: _*).as[PatientIrBen]
+  }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatients.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatients.scala
index 6b0bdc57..d8d5c3da 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatients.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatients.scala
@@ -2,60 +2,78 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients
 
-import org.apache.spark.sql.functions._
-import org.apache.spark.sql.{Column, DataFrame}
+import java.sql.Timestamp
+import org.apache.spark.sql.functions.{col, lit}
+import org.apache.spark.sql.types.{IntegerType, StringType, TimestampType}
+import org.apache.spark.sql.{Column, Dataset}
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions.computeDateUsingMonthYear
 
-private[patients] object McoPatients {
-
-  val inputColumns: List[Column] = List(
-    col("NUM_ENQ").as("patientID"),
-    col("MCO_B__SOR_MOD").as("SOR_MOD"),
-    col("SOR_MOI"),
-    col("SOR_ANN")
-  )
-
-  val outputColumns: List[Column] = List(
-    col("patientID"),
-    col("deathDate")
-  )
-
-  implicit class McoPatientsDataFrame(data: DataFrame) {
-
-    def getDeathDates(deathCode: Int): DataFrame = {
-      // TODO: We may need to check the consistency of {SOR_MOI, SOR_ANN} against SOR_DAT in MCO_C.
-      val deathDates: DataFrame = data.filter(col("SOR_MOD") === deathCode)
-        .withColumn("deathDate", computeDateUsingMonthYear(col("SOR_MOI"), col("SOR_ANN")))
-
-      val result = deathDates
-        .groupBy("patientID")
-        .agg(
-          countDistinct(col("deathDate")).as("count"),
-          min(col("deathDate")).as("deathDate")
-        ).cache()
-      /*
-      val conflicts = result
-        .filter(col("count") > 1)
-        .select(col("patientID"))
-        .distinct
-        .collect
-
-      if(conflicts.length != 0)
-        Logger.getLogger(getClass).warn("The patients in " +
-          conflicts.deep.mkString("\n") +
-          "\nhave conflicting DEATH DATES in MCO." +
-          "\nTaking Minimum Death Dates")
-      */
-      result
-    }
+case class PatientMco(patientID: String, exitMode: Int, exitMonth: String, exitYear: String, gender: Int, birthDate: Timestamp, deathDate: Option[Timestamp])
+  extends DerivedPatient
+
+private[patients] object McoPatients extends PatientExtractor[PatientMco] {
+
+  /** Find birth date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with birth date.
+   */
+  override def findPatientBirthDate(patients: Dataset[PatientMco]): Dataset[PatientMco] = {
+    patients
+  }
+
+  /** Find gender of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with gender.
+   */
+  override def findPatientGender(patients: Dataset[PatientMco]): Dataset[PatientMco] = {
+    patients
+  }
+
+  /** Find death date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with death date.
+   */
+  override def findPatientDeathDate(patients: Dataset[PatientMco]): Dataset[PatientMco] = {
+    import patients.sparkSession.implicits._
+    val deathCode = 9
+    patients
+      .filter(_.exitMode == deathCode)
+      .groupByKey(p => p.patientID)
+      .reduceGroups((p1, p2) => if ((p2.deathDate.isEmpty && p1.deathDate.isEmpty) || p2.deathDate.isEmpty || (p1.deathDate.isDefined && p1.deathDate.get.before(p2.deathDate.get))) p1 else p2)
+      .map(p =>
+        PatientMco(
+          p._2.patientID,
+          p._2.exitMode,
+          p._2.exitMonth,
+          p._2.exitYear,
+          p._2.gender,
+          p._2.birthDate,
+          p._2.deathDate
+        ))
+  }
 
-  def extract(mco: DataFrame, mcoDeathCode: Int = 9): DataFrame = {
+  /** Gets and prepares all the needed columns from the Sources.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @return A [[Dataset]] with needed columns.
+   */
+  override def getInput(sources: Sources): Dataset[PatientMco] = {
+    val inputColumns: List[Column] = List(
+      col("NUM_ENQ").cast(StringType).as("patientID"),
+      col("MCO_B__SOR_MOD").cast(IntegerType).as("exitMode"),
+      col("SOR_MOI").cast(StringType).as("exitMonth"),
+      col("SOR_ANN").cast(StringType).as("exitYear"),
+      lit(0).cast(IntegerType).as("gender"),
+      lit(null).cast(TimestampType).as("birthDate"),
+      computeDateUsingMonthYear(col("SOR_MOI"), col("SOR_ANN")).cast(TimestampType).as("deathDate")
+    )
 
-    mco
-      .select(inputColumns: _*)
-      .distinct
-      .getDeathDates(mcoDeathCode)
-      .select(outputColumns: _*)
+    val mco = sources.mco.get
+    import mco.sqlContext.implicits._
+    mco.select(inputColumns: _*).as[PatientMco]
+  }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatients.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatients.scala
index 02963d06..03b1d24f 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatients.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatients.scala
@@ -2,99 +2,147 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients
 
+import java.sql.Timestamp
 import org.apache.spark.sql.expressions.Window
 import org.apache.spark.sql.functions._
-import org.apache.spark.sql.types.{DoubleType, IntegerType, TimestampType}
-import org.apache.spark.sql.{DataFrame, Dataset}
+import org.apache.spark.sql.types.{IntegerType, StringType, TimestampType}
+import org.apache.spark.sql.{Column, Dataset}
 import fr.polytechnique.cmap.cnam.etl.extractors.patients.PatientUtils.estimateBirthDateCol
-import fr.polytechnique.cmap.cnam.etl.patients.Patient
-
-private[patients] object McocePatients {
-
-  implicit class McocePatientsImplicit(mce: DataFrame) {
-
-    def calculateBirthYear: DataFrame = {
-      val win = Window.partitionBy("patientID")
-
-      val birthYear = min(col("event_year") - col("age"))
-        .over(win)
-        .as("birth_year")
-
-      mce.groupBy("patientID", "age")
-        .agg(max(year(col("event_date"))).as("event_year"))
-        .select(col("patientID"), birthYear)
-        .distinct
-    }
-
-    def groupByIdAndAge: DataFrame = {
-      mce.groupBy("patientID", "age")
-        .agg(
-          sum("sex").cast(DoubleType).as("sum_sex"),
-          count("sex").cast(DoubleType).as("count_sex"),
-          min("event_date").as("min_event_date"),
-          max("event_date").as("max_event_date")
-        )
-    }
-
-    def calculateBirthDateAndGender: DataFrame = {
-      val genderCol = round(sum("sum_sex") / sum("count_sex"))
-        .cast(IntegerType)
-        .as("gender")
-
-      val birthDateCol = estimateBirthDateCol(
-        max("min_event_date"), min("max_event_date"),
-        first("birth_year")
-      ).as("birthDate")
-
-      mce.groupBy("patientID")
-        .agg(
-          genderCol,
-          birthDateCol
-        )
-    }
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+
+case class PatientMcoce(patientID: String, gender: Int, age: Int, birthDate: Timestamp, eventDate: Timestamp, deathDate: Option[Timestamp])
+  extends DerivedPatient
+
+
+private[patients] object McocePatients extends PatientExtractor[PatientMcoce] {
+
+  /** Find birth date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with birth date.
+   */
+  override def findPatientBirthDate(patients: Dataset[PatientMcoce]): Dataset[PatientMcoce] = {
+
+    val window = Window.partitionBy(col("patientID"))
+    val birthYear = min(col("eventYear") - col("age"))
+      .over(window)
+      .as("birthYear")
+
+    val patientsbirthYear = patients.groupBy("patientID", "age")
+      .agg(max(year(col("eventDate"))).as("eventYear"))
+      .select(col("patientID"), birthYear)
+      .distinct
+
+    val minmaxevent = patients
+      .groupBy(col("patientID"), col("age"))
+      .agg(
+        min("eventDate").as("minEventDate"), // the min event date for each age of a patient
+        max("eventDate").as("maxEventDate") // the max event date for each age of a patient
+      )
+
+    val birthDateAggCol: Column = estimateBirthDateCol(
+      max(col("minEventDate")).cast(TimestampType),
+      min(col("maxEventDate")).cast(TimestampType),
+      first(col("birthYear"))
+    )
+
+    val birthDate = minmaxevent.join(patientsbirthYear, "patientID")
+      .groupBy(col("patientID"))
+      .agg(birthDateAggCol.as("birthDate"))
+
+    import patients.sparkSession.implicits._
+
+    patients.as("patients")
+      .joinWith(birthDate.as("birthDateDf"), col("patients.patientID").equalTo(col("birthDateDf.patientID")), "left")
+      .map(p =>
+        PatientMcoce(
+          p._1.patientID,
+          p._1.gender,
+          p._1.age,
+          p._2.getAs("birthDate"),
+          p._1.eventDate,
+          p._1.deathDate
+        ))
+  }
 
-  def extract(
-    mcoce: DataFrame,
-    minGender: Int,
-    maxGender: Int,
-    minYear: Int,
-    maxYear: Int): Dataset[Patient] = {
+  /** Find gender of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with gender.
+   */
+  override def findPatientGender(patients: Dataset[PatientMcoce]): Dataset[PatientMcoce] = {
+
+    import patients.sparkSession.implicits._
+
+    val genderCodeError = 9
+
+    val gendercount = patients
+      .filter(_.gender != genderCodeError)
+      .groupByKey(p => (p.patientID, p.age))
+      .count()
+      .map(p => (p._1._1, p._2.toInt))
+
+    val gendersum = patients
+      .filter(_.gender != genderCodeError)
+      .map(p => ((p.patientID, p.age), p.gender))
+      .groupByKey(_._1)
+      .mapValues(row => row._2)
+      .reduceGroups((acc, str) => acc + str)
+      .map(p => (p._1, p._2))
+
+    val sumgendercount = gendercount
+      .groupByKey(p => p._1)
+      .mapValues(row => row._2)
+      .reduceGroups((acc, str) => acc + str)
+      .map(p => (p._1, p._2))
+
+    val sumgendersum = gendersum
+      .groupByKey(p => p._1._1)
+      .mapValues(row => row._2)
+      .reduceGroups((acc, str) => acc + str)
+      .map(p => (p._1, p._2))
+
+    val result = patients.joinWith(sumgendersum, patients("patientID") === sumgendersum("_1"), "left")
+
+    result.joinWith(sumgendercount, result("_1.patientID") === sumgendercount("_1"), "left")
+      .map(p =>
+        PatientMcoce(
+          p._1._1.patientID,
+          if (p._2 != null) Math.round(p._1._2._2.toFloat / p._2._2) else 0,
+          p._1._1.age,
+          p._1._1.birthDate,
+          p._1._1.eventDate,
+          p._1._1.deathDate
+        ))
+  }
 
-    val sexCol = when(
-      col("MCO_FASTC__COD_SEX").cast(IntegerType)
-        .between(minGender, maxGender), col("MCO_FASTC__COD_SEX")
-    )
-      .cast(IntegerType)
-      .as("sex")
+  /** Find death date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with death date.
+   */
+  override def findPatientDeathDate(patients: Dataset[PatientMcoce]): Dataset[PatientMcoce] = {
+    patients
+  }
 
-    val eventDateCol = when(
-      year(col("EXE_SOI_DTD"))
-        .between(minYear, maxYear), col("EXE_SOI_DTD")
-    )
-      .cast(TimestampType)
-      .as("event_date")
-
-    val ageCol = col("MCO_FASTC__AGE_ANN")
-      .cast(IntegerType)
-      .as("age")
-
-    val inputCols = List(
-      col("NUM_ENQ").as("patientID"),
-      sexCol,
-      ageCol,
-      eventDateCol
+  /** Gets and prepares all the needed columns from the Sources.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @return A [[Dataset]] with needed columns.
+   */
+  override def getInput(sources: Sources): Dataset[PatientMcoce] = {
+    val inputColumns: List[Column] = List(
+      col("NUM_ENQ").cast(StringType).as("patientID"),
+      col("MCO_FASTC__COD_SEX").cast(IntegerType).as("gender"),
+      col("MCO_FASTC__AGE_ANN").cast(IntegerType).as("age"),
+      lit(null).cast(TimestampType).as("birthDate"),
+      col("EXE_SOI_DTD").cast(TimestampType).as("eventDate"),
+      lit(null).cast(TimestampType).as("deathDate")
     )
-    val mcoceFiltered = mcoce.select(inputCols: _*)
-    val birthYears = mcoceFiltered.calculateBirthYear
-
-    import mcoce.sparkSession.implicits._
-    mcoceFiltered.groupByIdAndAge
-      .join(birthYears, "patientID")
-      .calculateBirthDateAndGender
-      .withColumn("deathDate", lit(null).cast(TimestampType))
-      .as[Patient]
+    val mcoce = sources.mcoCe.get
+    import mcoce.sqlContext.implicits._
+    mcoce.select(inputColumns: _*).as[PatientMcoce]
   }
-
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientExtractor.scala
new file mode 100644
index 00000000..89ba77bc
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientExtractor.scala
@@ -0,0 +1,80 @@
+package fr.polytechnique.cmap.cnam.etl.extractors.patients
+
+import java.sql.Timestamp
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.{Column, Dataset}
+import fr.polytechnique.cmap.cnam.etl.patients.Patient
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+
+trait DerivedPatient {
+  val patientID: String
+  val gender: Int
+  val birthDate: Timestamp
+  val deathDate: Option[Timestamp]
+}
+
+trait PatientExtractor[PatientType <: DerivedPatient] {
+
+  /** Find birth date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with birth date.
+   */
+  def findPatientBirthDate(patients: Dataset[PatientType]): Dataset[PatientType]
+
+  /** Find gender of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with gender.
+   */
+  def findPatientGender(patients: Dataset[PatientType]): Dataset[PatientType]
+
+  /** Find death date of patients.
+   *
+   * @param patients Dataset that contains all patients.
+   * @return A [[Dataset]] of patients with death date.
+   */
+  def findPatientDeathDate(patients: Dataset[PatientType]): Dataset[PatientType]
+
+  /** Transforms a Dataset of DerivedPatient into a Dataset of Patient.
+   *
+   * @param patients Dataset that contains all derived patients.
+   * @return A [[Dataset]] with the needed columns of Patient.
+   */
+  def fromDerivedPatienttoPatient(patients: Dataset[PatientType]): Dataset[Patient] = {
+    val outputColumns: List[Column] = List(
+      col("patientID"),
+      col("gender"),
+      col("birthDate"),
+      col("deathDate")
+    )
+
+    import patients.sqlContext.implicits._
+    patients.select(outputColumns: _*).as[Patient]
+  }
+
+  /** Gets and prepares all the needed columns from the Sources.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @return A [[Dataset]] with needed columns.
+   */
+  def getInput(sources: Sources): Dataset[PatientType]
+
+  /** Extracts the Patient from the Source.
+   *
+   * This function is responsible for gluing together the other parts of the Extractor.
+   * This method should be considered the unique callable method from a Study perspective.
+   *
+   * @param sources Source object [[Sources]] that contains all sources.
+   * @return A Dataset of Patient.
+   */
+  def extract(sources: Sources): Dataset[Patient] = {
+    val input: Dataset[PatientType] = getInput(sources)
+
+    input.transform(findPatientBirthDate)
+      .transform(findPatientGender)
+      .transform(findPatientDeathDate)
+      .transform(fromDerivedPatienttoPatient)
+      .distinct()
+  }
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/Patients.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/Patients.scala
deleted file mode 100644
index 3f9f10a0..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/Patients.scala
+++ /dev/null
@@ -1,109 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.patients
-
-import java.sql.Timestamp
-import org.apache.spark.sql.functions._
-import org.apache.spark.sql.{Column, DataFrame, Dataset}
-import fr.polytechnique.cmap.cnam.etl.patients._
-import fr.polytechnique.cmap.cnam.etl.sources.Sources
-import fr.polytechnique.cmap.cnam.util.datetime.implicits._
-import fr.polytechnique.cmap.cnam.util.functions.makeTS
-
-class Patients(config: PatientsConfig) {
-
-  import Patients.validateDeathDate
-
-  def extract(sources: Sources): Dataset[Patient] = {
-
-    val dcir = sources.dcir.get
-    val mco = sources.mco.get
-    val irBen = sources.irBen.get
-
-    val mcoPatients: DataFrame = McoPatients.extract(mco, config.mcoDeathCode).toDF.as("mco")
-
-    val irBenPatients: DataFrame = IrBenPatients.extract(
-      irBen, config.minYear, config.maxYear
-    ).toDF.as("irBen")
-
-    val dcirPatients: DataFrame = DcirPatients.extract(
-      dcir, config.minGender, config.maxGender, config.minYear, config.maxYear
-    ).toDF.as("dcir")
-
-    import dcirPatients.sqlContext.implicits._
-
-    val joinColumn: Column = coalesce(col("irBen.patientID"), col("mco.patientID"))
-
-    val patients: DataFrame = irBenPatients
-      .join(mcoPatients, col("irBen.patientID") === col("mco.patientID"), "outer")
-      .join(dcirPatients, joinColumn === col("dcir.patientID"), "outer")
-
-    val patientID: Column = coalesce(
-      col("dcir.patientID"),
-      col("irBen.patientID"),
-      col("mco.patientID")
-    )
-
-    val gender: Column = coalesce(col("irBen.gender"), col("dcir.gender"))
-
-    val birthDate: Column = coalesce(col("irBen.birthDate"), col("dcir.birthDate"))
-
-    val deathDate: Column = coalesce(
-      when(
-        validateDeathDate(col("irBen.deathDate"), birthDate, config.maxYear),
-        col("irBen.deathDate")
-      ),
-      when(
-        validateDeathDate(col("dcir.deathDate"), birthDate, config.maxYear),
-        col("dcir.deathDate")
-      ),
-      when(
-        validateDeathDate(col("mco.deathDate"), birthDate, config.maxYear),
-        col("mco.deathDate")
-      )
-    )
-
-    val ageReferenceDate: Timestamp = config.ageReferenceDate
-    val age = floor(months_between(lit(ageReferenceDate), birthDate) / 12)
-    val filterPatientsByAge = age >= config.minAge && age < config.maxAge
-
-    val filteredPatients = patients.where(filterPatientsByAge)
-      .select(
-        patientID.as("patientID"),
-        gender.as("gender"),
-        birthDate.as("birthDate"),
-        deathDate.as("deathDate")
-      )
-
-    sources.mcoCe match {
-      case None => filteredPatients.as[Patient]
-      case Some(mcoce) =>
-        val mcocePatients = McocePatients
-          .extract(mcoce, config.minGender, config.maxGender, config.minYear, config.maxYear)
-          .toDF()
-          .as("mco_ce")
-
-        val allPatients = filteredPatients.as("patients")
-          .join(mcocePatients, col("patients.patientID") === col("mco_ce.patientID"), "full")
-
-        val idCol = coalesce(col("patients.patientID"), col("mco_ce.patientID"))
-          .alias("patientID")
-        val genderCol = coalesce(col("patients.gender"), col("mco_ce.gender"))
-          .alias("gender")
-        val birthDateCol = coalesce(col("patients.birthDate"), col("mco_ce.birthDate"))
-          .alias("birthDate")
-        val deathDateCol = coalesce(col("patients.deathDate"), col("mco_ce.deathDate"))
-          .alias("deathDate")
-
-        allPatients.select(idCol, genderCol, birthDateCol, deathDateCol)
-          .filter(col("patientID").isNotNull && col("gender").isNotNull && col("birthDate").isNotNull)
-          .as[Patient]
-    }
-  }
-}
-
-object Patients {
-
-  def validateDeathDate(deathDate: Column, birthDate: Column, maxYear: Int): Column =
-    deathDate.between(birthDate, makeTS(maxYear, 1, 1))
-}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsConfig.scala
index cdf39c84..97666bc3 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsConfig.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsConfig.scala
@@ -4,6 +4,7 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients
 
 import java.sql.Timestamp
 import java.time.LocalDate
+
 import fr.polytechnique.cmap.cnam.etl.extractors.ExtractorConfig
 
 class PatientsConfig(
@@ -14,7 +15,7 @@ class PatientsConfig(
   val maxYear: Int = 2020,
   val minGender: Int = 1,
   val maxGender: Int = 2,
-  val mcoDeathCode: Int = 9) extends ExtractorConfig
+  val mcoDeathCode: Int = 9) extends ExtractorConfig with Serializable
 
 object PatientsConfig {
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityExtractor.scala
deleted file mode 100644
index 4bcdd724..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityExtractor.scala
+++ /dev/null
@@ -1,43 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.prestations
-
-import java.sql.Timestamp
-import scala.util.Try
-import org.apache.spark.sql.Row
-import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.dcir.DcirExtractor
-
-object MedicalPractitionerClaimExtractor extends DcirExtractor[PractitionerClaimSpeciality] {
-  override val columnName: String = ColNames.MSpe
-  override val eventBuilder: EventBuilder = MedicalPractitionerClaim
-
-  override def code: Row => String = (row: Row) => row.getAs[Integer](columnName).toString
-
-  override def extractStart(r: Row): Timestamp = {
-    Try(super.extractStart(r)) recover {
-      case _: NullPointerException => extractFluxDate(r)
-    }
-  }.get
-
-  override def extractGroupId(r: Row): String = {
-    r.getAs[String](ColNames.ExecPSNum)
-  }
-}
-
-object NonMedicalPractitionerClaimExtractor extends DcirExtractor[PractitionerClaimSpeciality] {
-  override val columnName: String = ColNames.NonMSpe
-  override val eventBuilder: EventBuilder = NonMedicalPractitionerClaim
-
-  override def code: Row => String = (row: Row) => row.getAs[Integer](columnName).toString
-
-  override def extractStart(r: Row): Timestamp = {
-    Try(super.extractStart(r)) recover {
-      case _: NullPointerException => extractFluxDate(r)
-    }
-  }.get
-
-  override def extractGroupId(r: Row): String = {
-    r.getAs[String](ColNames.ExecPSNum)
-  }
-}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirRowExtractor.scala
new file mode 100644
index 00000000..145300fb
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirRowExtractor.scala
@@ -0,0 +1,68 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir
+
+import java.sql.Timestamp
+import scala.util.Try
+import org.apache.commons.codec.binary.Base64
+import org.apache.spark.sql.Row
+import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor
+import fr.polytechnique.cmap.cnam.util.datetime.implicits._
+
+/**
+ * Gets the following fields for DCIR sourced events: patientID, start, groupId.
+ */
+trait DcirRowExtractor extends DcirSource with EventRowExtractor {
+
+  override def usedColumns: List[ColName] = List(
+    ColNames.PatientID, ColNames.DcirFluxDate, ColNames.DcirEventStart,
+    ColNames.FlowDistributionDate, ColNames.FlowTreatementDate, ColNames.FlowEmitterType,
+    ColNames.FlowEmitterId, ColNames.FlowEmitterNumber,
+    ColNames.OrgId, ColNames.OrderId, ColNames.DcirEventStart
+  ) ++ super.usedColumns
+
+  def extractPatientId(r: Row): String = {
+    r.getAs[String](ColNames.PatientID)
+  }
+
+  /** Tries to catch unknown dates.
+   * An example of an unknown-date situation: IJ (Indemnité Journalière), a replacement income
+   * paid by the Health Insurance during a sick leave.
+   *
+   * @param r The Row object itself
+   * @return The date of the event, or the flux date if it doesn't exist
+   */
+  def extractStart(r: Row): Timestamp = {
+    Try(r.getAs[java.util.Date](ColNames.DcirEventStart).toTimestamp) recover {
+      case _: NullPointerException => extractFluxDate(r)
+    }
+  }.get
+
+
+  def extractFluxDate(r: Row): Timestamp = r.getAs[java.util.Date](ColNames.DcirFluxDate).toTimestamp
+
+  /** Generates a hash value, as a string, for the groupID of a row from the values
+   * FLX_DIS_DTD, FLX_TRT_DTD, FLX_EMT_TYP, FLX_EMT_NUM, FLX_EMT_ORD, ORG_CLE_NUM, DCT_ORD_NUM.
+   * These are the 7 columns that uniquely identify a prescription.
+   *
+   * @param r The Row object itself.
+   * @return A unique hash ID as a string.
+   */
+  override def extractGroupId(r: Row): String = {
+    Base64.encodeBase64(
+      s"${r.getAs[String](ColNames.FlowDistributionDate)}_${r.getAs[String](ColNames.FlowTreatementDate)}_${
+        r.getAs[String](ColNames.FlowEmitterType)
+      }_${r.getAs[String](ColNames.FlowEmitterId)}_${r.getAs[String](ColNames.FlowEmitterNumber)}_${
+        r.getAs[String](ColNames.OrgId)
+      }_${r.getAs[String](ColNames.OrderId)}".getBytes()
+    ).map(_.toChar).mkString
+  }
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirSimpleExtractor.scala
new file mode 100644
index 00000000..ccbf8f12
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirSimpleExtractor.scala
@@ -0,0 +1,13 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir
+
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.functions.col
+import fr.polytechnique.cmap.cnam.etl.events.AnyEvent
+import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+
+trait DcirSimpleExtractor[EventType <: AnyEvent] extends DcirRowExtractor with SimpleExtractor[EventType] {
+  def getInput(sources: Sources): DataFrame = sources.dcir.get.select(neededColumns.map(col): _*)
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/dcir/DcirSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirSource.scala
similarity index 55%
rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/dcir/DcirSource.scala
rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirSource.scala
index b48edfb9..52d59b66 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/dcir/DcirSource.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirSource.scala
@@ -1,7 +1,8 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.dcir
+package fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir
 
 import fr.polytechnique.cmap.cnam.etl.extractors.ColumnNames
 
+/** Trait to retrieve the columns of the dcir dataframe. */
 trait DcirSource extends ColumnNames {
 
   final object ColNames extends Serializable {
@@ -16,32 +17,15 @@ trait DcirSource extends ColumnNames {
     lazy val GHSCode: String = "ER_ETE_F__ETE_GHS_NUM"
     lazy val InstitutionCode: String = "ER_ETE_F__ETE_TYP_COD"
     lazy val Sector: String = "ER_ETE_F__PRS_PPU_SEC"
-    lazy val Date: String = "EXE_SOI_DTD"
     lazy val NaturePrestation: ColName = "PRS_NAT_REF"
     lazy val NgapCoefficient: ColName = "PRS_ACT_CFT"
 
-    lazy val all = List(
-      PatientID,
-      CamCode,
-      GHSCode,
-      InstitutionCode,
-      Sector,
-      Date,
-      MSpe,
-      NonMSpe,
-      ExecPSNum,
-      DcirFluxDate,
-      NaturePrestation,
-      NgapCoefficient
-    )
-
-
-    lazy val DateStart: ColName = "FLX_DIS_DTD"
-    lazy val DateEntry: ColName = "FLX_TRT_DTD"
-    lazy val EmitterType: ColName = "FLX_EMT_TYP"
-    lazy val EmitterId: ColName = "FLX_EMT_NUM"
-    lazy val FlowNumber: ColName = "FLX_EMT_ORD"
-    lazy val OrgId: ColName = "ORG_CLE_NUM"
-    lazy val OrderId: ColName = "DCT_ORD_NUM"
+    lazy val FlowDistributionDate: ColName = "FLX_DIS_DTD"
+    lazy val FlowTreatementDate: ColName = "FLX_TRT_DTD"
+    lazy val FlowEmitterType: ColName = "FLX_EMT_TYP"
+    lazy val FlowEmitterId: ColName = "FLX_EMT_NUM"
+    lazy val FlowEmitterNumber: ColName = "FLX_EMT_ORD"
+    lazy val OrgId: ColName = "ORG_CLE_NUM"
+    lazy val OrderId: ColName = "DCT_ORD_NUM"
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadRowExtractor.scala
new file mode 100644
index 00000000..e99771b4
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadRowExtractor.scala
@@ -0,0 +1,30 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.sources.had
+
+import java.sql.Timestamp
+import org.apache.spark.sql.Row
+import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor
+
+/**
+ * Gets the following fields for HAD sourced events: patientID, start, groupId.
+ */
+trait HadRowExtractor extends HadSource with EventRowExtractor {
+
+  override def usedColumns: List[String] = List(
+    ColNames.PatientID, ColNames.EtaNumEpmsi, ColNames.RhadNum,
+    NewColumns.Year, NewColumns.EstimatedStayStart, ColNames.StayStartDate
+  ) ++ super.usedColumns
+
+  def extractPatientId(r: Row): String = {
+    r.getAs[String](ColNames.PatientID)
+  }
+
+  override def extractGroupId(r: Row): String = {
+    r.getAs[String](ColNames.EtaNumEpmsi) + "_" +
+      r.getAs[String](ColNames.RhadNum) + "_" +
+      r.getAs[Int](NewColumns.Year).toString
+  }
+
+  def extractStart(r: Row): Timestamp = r.getAs[Timestamp](NewColumns.EstimatedStayStart)
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSimpleExtractor.scala
new file mode 100644
index 00000000..a2b6acb9
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSimpleExtractor.scala
@@ -0,0 +1,15 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.sources.had
+
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.functions.col
+import fr.polytechnique.cmap.cnam.etl.events.AnyEvent
+import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+
+trait HadSimpleExtractor[EventType <: AnyEvent] extends HadRowExtractor with SimpleExtractor[EventType] {
+  override def getInput(sources: Sources): DataFrame = sources.had.get.estimateStayStartTime
+    .select(neededColumns.map(col): _*)
+
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSource.scala
similarity index 79%
rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadSource.scala
rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSource.scala
index f0d98608..afc5345c 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadSource.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSource.scala
@@ -1,10 +1,9 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.had
+package fr.polytechnique.cmap.cnam.etl.extractors.sources.had
 
-import fr.polytechnique.cmap.cnam.etl.extractors.ColumnNames
-import fr.polytechnique.cmap.cnam.util.ColumnUtilities.parseTimestamp
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.types.TimestampType
 import org.apache.spark.sql.{Column, DataFrame}
+import fr.polytechnique.cmap.cnam.etl.extractors.ColumnNames
 
 trait HadSource extends ColumnNames {
 
@@ -24,12 +23,8 @@ trait HadSource extends ColumnNames {
     val StayEndDate: ColName = "SOR_DAT"
     val StartDate: ColName = "EXE_SOI_DTD"
     val EndDate: ColName = "EXE_SOI_DTF"
-    val all = List(
-      PatientID, DP, DA, CCAM, PEC_PAL, PEC_ASS, EtaNumEpmsi, RhadNum,
-      StayStartDate, StayEndDate, StartDate, EndDate
-    )
-    val hospitalStayPart = List(
-      PatientID, EtaNumEpmsi, RhadNum, StartDate, StayStartDate, StayEndDate, EndDate
+    val core: List[ColName] = List(
+      PatientID, EtaNumEpmsi, RhadNum, StayStartDate, StayEndDate, StartDate, EndDate
     )
   }
 
@@ -53,9 +48,11 @@ trait HadSource extends ColumnNames {
     val givenYear: Column = year(givenDate)
 
     df.withColumn(
-      NewColumns.EstimatedStayStart, givenDate)
+      NewColumns.EstimatedStayStart, givenDate
+    )
       .withColumn(
-        NewColumns.Year, givenYear)
+        NewColumns.Year, givenYear
+      )
   }
 }
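The two `groupId` schemes in this diff — the underscore-joined `EtaNumEpmsi_RhadNum_Year` key of `HadRowExtractor` and the Base64-encoded seven-field flux key of `DcirRowExtractor` — can be sketched outside Spark. The sketch below is illustrative only: plain strings stand in for `Row` lookups, the field values are hypothetical, and `java.util.Base64` replaces the commons-codec `Base64.encodeBase64` call used in the actual extractor (both produce the same standard encoding).

```scala
import java.util.Base64

object GroupIdSketch {
  // Underscore-joined key, as in HadRowExtractor (hospital id, stay id, year).
  def hadGroupId(etaNumEpmsi: String, rhadNum: String, year: Int): String =
    etaNumEpmsi + "_" + rhadNum + "_" + year.toString

  // Base64 of the underscore-joined flux fields, as in DcirRowExtractor
  // (FLX_DIS_DTD, FLX_TRT_DTD, FLX_EMT_TYP, FLX_EMT_NUM, FLX_EMT_ORD,
  // ORG_CLE_NUM, DCT_ORD_NUM). Field values here are placeholders.
  def dcirGroupId(fields: Seq[String]): String =
    Base64.getEncoder.encodeToString(fields.mkString("_").getBytes)

  def main(args: Array[String]): Unit = {
    println(hadGroupId("750712184", "12", 2015)) // prints 750712184_12_2015

    val id = dcirGroupId(Seq("20150101", "20150102", "1", "75", "0", "42", "7"))
    // The encoding is lossless: decoding restores the joined key, so two rows
    // share a groupId only when all seven fields are equal.
    assert(new String(Base64.getDecoder.decode(id)) == "20150101_20150102_1_75_0_42_7")
  }
}
```

Because the Base64 step is reversible, this is an identifier rather than a true hash: uniqueness holds exactly when the seven-field combination is unique.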
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbRowExtractor.scala new file mode 100644 index 00000000..cc0d5af8 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbRowExtractor.scala @@ -0,0 +1,61 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.imb + +import java.sql.{Date, Timestamp} +import scala.util.Try +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor +import fr.polytechnique.cmap.cnam.util.datetime +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +/** + * Gets the following fields for IMB sourced events: patientID, start, end, groupId. + * + * IR_IMB_R contains the Chronic Diseases Diagnoses or `ALD = Affection Longue Durée` for patients once + * they have been exonerated for all care related to this chronic disease. + * It is the medical service of the health insurance that grants this ALD on the proposal of the + * patient's GP (Medecin Traitant). + * See the [online SNDS documentation for further details] + * (https://documentation-snds.health-data-hub.fr/fiches/beneficiaires_ald.html#le-dispositif-des-ald) + */ +trait ImbRowExtractor extends ImbSource with EventRowExtractor { + + def extractEncoding(row: Row): String = row.getAs[String](ColNames.Encoding) + + override def extractPatientId(row: Row): String = row.getAs[String](ColNames.PatientID) + + override def extractStart(row: Row): Timestamp = { + import datetime.implicits._ + + row.getAs[Date](ColNames.Date).toTimestamp + } + + /** + * The End date of the ALD is not always written. It can take the value 1600-01-01, which + * corresponds to an unset value that we convert to None.
+ * See the CNAM documentation [available here](https://documentation-snds.health-data-hub.fr/fiches/beneficiaires_ald.html#annexe) + * + * @param r + * @return + */ + override def extractEnd(r: Row): Option[Timestamp] = { + import datetime.implicits._ + Try( + { + val rawEndDate = r.getAs[java.util.Date](ColNames.EndDate).toTimestamp + + if (makeTS(1700, 1, 1).after(rawEndDate)) { + None + } + else { + Some(rawEndDate) + } + } + ) recover { + case _: NullPointerException => None + } + }.get +} + + diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbSimpleExtractor.scala new file mode 100644 index 00000000..3bdf4a24 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbSimpleExtractor.scala @@ -0,0 +1,13 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.imb + +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.functions.col +import fr.polytechnique.cmap.cnam.etl.events.AnyEvent +import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +trait ImbSimpleExtractor[EventType <: AnyEvent] extends ImbRowExtractor with SimpleExtractor[EventType]{ + def getInput(sources: Sources): DataFrame = sources.irImb.get.select(neededColumns.map(col): _*) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbSource.scala new file mode 100644 index 00000000..bc2f1ca4 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/imb/ImbSource.scala @@ -0,0 +1,15 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.imb + +import fr.polytechnique.cmap.cnam.etl.extractors.ColumnNames + +trait ImbSource extends ColumnNames{ + 
final object ColNames extends Serializable { + final lazy val PatientID = "NUM_ENQ" + final lazy val Encoding = "MED_NCL_IDT" + final lazy val Code = "MED_MTF_COD" + final lazy val Date = "IMB_ALD_DTD" + final lazy val EndDate = "IMB_ALD_DTF" + } +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoRowExtractor.scala new file mode 100644 index 00000000..58550be0 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoRowExtractor.scala @@ -0,0 +1,45 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mco + +import java.sql.Timestamp +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor + +/** + * Gets the following fields for MCO sourced events: patientID, start, groupId. + */ +trait McoRowExtractor extends McoSource with EventRowExtractor { + + override def usedColumns: List[String] = ColNames.core ++ super.usedColumns + + /** Gets the PatientID value from the MCO source. + * + * @param r The row itself. + * @return The value of PatientID. + */ + def extractPatientId(r: Row): String = { + r.getAs[String](ColNames.PatientID) + } + + /** Creates an ID that groups Events of different categories + * by concatenating ETA_NUM, RSA_NUM and the YEAR. + * + * @param r The row itself. + * @return The value of groupId. + */ + override def extractGroupId(r: Row): String = { + r.getAs[String](ColNames.EtaNum) + "_" + + r.getAs[String](ColNames.RsaNum) + "_" + + r.getAs[Int](ColNames.Year).toString + } + + /** Extracts the EstimatedStayStart as the start. + * It comes from the method [[McoDataFrame.estimateStayStartTime]]. + * + * @param r The row itself. + * @return The value of EstimatedStayStart.
+ */ + def extractStart(r: Row): Timestamp = r.getAs[Timestamp](NewColumns.EstimatedStayStart) +} + diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSimpleExtractor.scala new file mode 100644 index 00000000..f131226a --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSimpleExtractor.scala @@ -0,0 +1,14 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mco + +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.functions.col +import fr.polytechnique.cmap.cnam.etl.events.AnyEvent +import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +trait McoSimpleExtractor[EventType <: AnyEvent] extends McoRowExtractor with SimpleExtractor[EventType]{ + def getInput(sources: Sources): DataFrame = + sources.mco.get.select(neededColumns.map(col): _*).estimateStayStartTime +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSource.scala similarity index 85% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSource.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSource.scala index 38e4da09..9caebc4e 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSource.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSource.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.mco +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mco import org.apache.spark.sql.functions._ import org.apache.spark.sql.types.{LongType, TimestampType} @@ -28,12 +28,20 @@ trait McoSource extends ColumnNames { val StayEndDate: ColName = 
"SOR_DAT" val StartDate: ColName = "EXE_SOI_DTD" val EndDate: ColName = "EXE_SOI_DTF" + val CCAMDelayDate: ColName = "MCO_A__ENT_DAT_DEL" + val StayFrom: ColName = "MCO_B__ENT_MOD" + val StayFromType: ColName = "MCO_B__ENT_PRV" + + val core = List( + PatientID, EtaNum, RsaNum, Year, StayEndMonth, StayEndYear, StayLength, + StayStartDate, StayEndDate, StartDate, EndDate + ) val all = List( PatientID, DP, DR, DA, CCAM, GHM, EtaNum, RsaNum, Year, ExitMode, StayEndMonth, StayEndYear, StayLength, - StayStartDate, StayEndDate, StartDate, EndDate + StayStartDate, StayEndDate, StartDate, EndDate, CCAMDelayDate ) val hospitalStayPart = List( - PatientID, EtaNum, RsaNum, Year, StartDate, EndDate + PatientID, EtaNum, RsaNum, Year, StartDate, EndDate, StayFrom, StayFromType ) } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeRowExtractor.scala new file mode 100644 index 00000000..fd370f75 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeRowExtractor.scala @@ -0,0 +1,37 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce + +import java.sql.Timestamp +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor +import fr.polytechnique.cmap.cnam.util.datetime.implicits._ + +/** + * Gets the following fields for MCO_CE sourced events: patientID, start, groupId. 
+ */ +trait McoCeRowExtractor extends McoCeSource with EventRowExtractor { + override def usedColumns: List[String] = super.usedColumns ++ List( + ColNames.PatientID, ColNames.EtaNum, + ColNames.SeqNum, ColNames.Year, + ColNames.StartDate + ) + + def extractPatientId(r: Row): String = { + r.getAs[String](ColNames.PatientID) + } + + /** Return groupID as hospital stay ID + * + * @param r + * @return groupId which is the unique ID of the hospital stay + */ + override def extractGroupId(r: Row): String = { + r.getAs[String](ColNames.EtaNum) + "_" + + r.getAs[String](ColNames.SeqNum) + "_" + + r.getAs[Int](ColNames.Year).toString + } + + def extractStart(r: Row): Timestamp = r.getAs[Timestamp](ColNames.StartDate).toTimestamp + +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeSimpleExtractor.scala new file mode 100644 index 00000000..38096ea3 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeSimpleExtractor.scala @@ -0,0 +1,11 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce + +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.functions.col +import fr.polytechnique.cmap.cnam.etl.events.AnyEvent +import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +trait McoCeSimpleExtractor[EventType <: AnyEvent] extends McoCeRowExtractor with SimpleExtractor[EventType]{ + def getInput(sources: Sources): DataFrame = sources.mcoCe.get.select(neededColumns.map(col): _*) +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeSource.scala new file mode 100644 index 00000000..b9128afd --- /dev/null +++ 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeSource.scala @@ -0,0 +1,46 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce + +import fr.polytechnique.cmap.cnam.etl.extractors.ColumnNames + +trait McoCeSource extends ColumnNames { + + final object ColNames extends Serializable { + // Essential for all the Extractors + val PatientID: ColName = "NUM_ENQ" + val EtaNum: ColName = "ETA_NUM" + val SeqNum: ColName = "SEQ_NUM" + val CamCode = "MCO_FMSTC__CCAM_COD" + val Year = "year" + + // NGAP from FBSTC + val NgapKeyLetterFbstc = "MCO_FBSTC__ACT_COD" + val NgapCoefficientFbstc = "MCO_FBSTC__ACT_COE" + + // Practionner from FBSTC + val PractitionnerSpecialtyFbstc = "MCO_FBSTC__EXE_SPE" + + // NGAP for FSCTC + val NgapKeyLetterFcstc = "MCO_FCSTC__ACT_COD" + val NgapCoefficientFcstc = "MCO_FCSTC__ACT_COE" + + // Practionner from FCSTC + val PractitionnerSpecialtyFcstc = "MCO_FCSTC__EXE_SPE" + + val StartDate: String = "EXE_SOI_DTD" + val EndDate: String = "EXE_SOI_DTF" + val ActCode: String = "MCO_FBSTC__ACT_COD" + + val core = List( + PatientID, EtaNum, SeqNum, Year, StartDate + ) + + val all = List( + PatientID, EtaNum, SeqNum, Year, CamCode, StartDate, + NgapKeyLetterFbstc, NgapCoefficientFbstc, PractitionnerSpecialtyFbstc, + NgapKeyLetterFcstc, NgapCoefficientFcstc, PractitionnerSpecialtyFcstc + ) + + + } + +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrRowExtractor.scala new file mode 100644 index 00000000..e99e49aa --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrRowExtractor.scala @@ -0,0 +1,26 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr + +import java.sql.Timestamp +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor + +/** + * Gets the following fields for SSR sourced 
events: patientID, start, groupId. + */ +trait SsrRowExtractor extends SsrSource with EventRowExtractor { + + override def usedColumns: List[String] = ColNames.core ++ super.usedColumns + + def extractPatientId(r: Row): String = { + r.getAs[String](ColNames.PatientID) + } + + override def extractGroupId(r: Row): String = { + r.getAs[String](ColNames.EtaNum) + "_" + + r.getAs[String](ColNames.RhaNum) + "_" + + r.getAs[String](ColNames.RhsNum) + "_" + + r.getAs[Int](ColNames.Year).toString + } + + def extractStart(r: Row): Timestamp = r.getAs[Timestamp](NewColumns.EstimatedStayStart) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSimpleExtractor.scala new file mode 100644 index 00000000..d21640bc --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSimpleExtractor.scala @@ -0,0 +1,13 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr + +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.functions.col +import fr.polytechnique.cmap.cnam.etl.events.AnyEvent +import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +trait SsrSimpleExtractor[EventType <: AnyEvent] extends SsrRowExtractor with SimpleExtractor[EventType]{ + def getInput(sources: Sources): DataFrame = sources.ssr.get.estimateStayStartTime.select(neededColumns.map(col): _*) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSource.scala similarity index 76% rename from src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrSource.scala rename to src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSource.scala index 21b1255f..ab6edbb0 100644 --- 
a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrSource.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSource.scala @@ -1,4 +1,4 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.ssr +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr import org.apache.spark.sql.functions._ import org.apache.spark.sql.types.{LongType, TimestampType} @@ -9,26 +9,37 @@ import fr.polytechnique.cmap.cnam.util.ColumnUtilities.parseTimestamp trait SsrSource extends ColumnNames { final object ColNames extends Serializable { - val PatientID: ColName = "SSR_C__NUM_ENQ" - val DP: ColName = "MOR_PRP" - val DR: ColName = "ETL_AFF" + val PatientID: ColName = "NUM_ENQ" + val StayStartMonth: ColName = "MOI_LUN_1S" + val StayStartYear: ColName = "ANN_LUN_1S" + val StayStartDate: ColName = "ENT_DAT" + val StayEndDate: ColName = "SOR_DAT" + val StartDate: ColName = "EXE_SOI_DTD" + val EndDate: ColName = "EXE_SOI_DTF" + val EtaNum: ColName = "ETA_NUM" + val RhaNum: ColName = "RHA_NUM" + val RhsNum: ColName = "RHS_NUM" + val Year: ColName = "year" + + val core = List( + PatientID, StayStartMonth, StayStartYear, StayStartDate, StayEndDate, StartDate, EndDate, + EtaNum, RhaNum, RhsNum, Year, NewColumns.EstimatedStayStart + ) + val StayLength: ColName = "SSR_B__RHS_ANT_SEJ_ENT" + val DP: ColName = "SSR_B__MOR_PRP" + + val DR: ColName = "SSR_B__ETL_AFF" + val DA: ColName = "SSR_D__DGN_COD" + val CCAM: ColName = "SSR_CCAM__CCAM_ACT" // present only in 2014-2015-2016, should be added for studies on the échantillon + val CSARR: ColName = "SSR_CSARR__CSARR_COD" - val FP_PEC: ColName = "FP_PEC" + + val FP_PEC: ColName = "SSR_B__FP_PEC" // MOI_ANN_SOR_SEJ ?
//val GHM: ColName = "SSR_B__GRG_GHM" -> GME TODO - val EtaNum: ColName = "ETA_NUM" - val RhaNum: ColName = "RHA_NUM" - val RhsNum: ColName = "RHS_NUM" - val Year: ColName = "year" - val StayStartMonth: ColName = "SSR_C__MOI_LUN_1S" - val StayStartYear: ColName = "SSR_C__ANN_LUN_1S" - val StayLength: ColName = "RHS_ANT_SEJ_ENT" - val StayStartDate: ColName = "SSR_C__ENT_DAT" - val StayEndDate: ColName = "SSR_C__SOR_DAT" - val StartDate: ColName = "SSR_C__EXE_SOI_DTD" - val EndDate: ColName = "SSR_C__EXE_SOI_DTF" + val all = List( PatientID, DP, DR, DA, CCAM, CSARR, FP_PEC, EtaNum, RhaNum, RhsNum, StayLength, //CSARR, StayStartDate, StayEndDate, StartDate, EndDate, Year, StayStartMonth, StayStartYear diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeRowExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeRowExtractor.scala new file mode 100644 index 00000000..4f0a3ee4 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeRowExtractor.scala @@ -0,0 +1,18 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce + +import java.sql.Timestamp +import org.apache.spark.sql.Row +import fr.polytechnique.cmap.cnam.etl.extractors.EventRowExtractor + +/** + * Gets the following fields for SSR_CE sourced events: patientID and start. 
+ */ +trait SsrCeRowExtractor extends SsrCeSource with EventRowExtractor { + override def usedColumns: List[String] = ColNames.core ++ super.usedColumns + + override def extractPatientId(row: Row): String = row.getAs[String](ColNames.PatientID) + + override def extractStart(row: Row): Timestamp = row.getAs[Timestamp](ColNames.StartDate) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeSimpleExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeSimpleExtractor.scala new file mode 100644 index 00000000..51365d49 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeSimpleExtractor.scala @@ -0,0 +1,13 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce + +import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.functions.col +import fr.polytechnique.cmap.cnam.etl.events.AnyEvent +import fr.polytechnique.cmap.cnam.etl.extractors.SimpleExtractor +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +trait SsrCeSimpleExtractor [EventType <: AnyEvent] extends SsrCeRowExtractor with SimpleExtractor[EventType]{ + def getInput(sources: Sources): DataFrame = sources.ssrCe.get.select(neededColumns.map(col): _*) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeSource.scala new file mode 100644 index 00000000..bc970006 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeSource.scala @@ -0,0 +1,18 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce + +import fr.polytechnique.cmap.cnam.etl.extractors.ColumnNames + +trait SsrCeSource extends ColumnNames { + final object ColNames extends Serializable { + final lazy val PatientID = "NUM_ENQ" + final lazy val StartDate = "EXE_SOI_DTD" + 
final lazy val core = List( + PatientID, StartDate + ) + + final lazy val CamCode = "SSR_FMSTC__CCAM_COD" + final lazy val all = List(PatientID, CamCode, StartDate) + } +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrExtractor.scala deleted file mode 100644 index 3578dfaa..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrExtractor.scala +++ /dev/null @@ -1,47 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.ssr - -import java.sql.Timestamp -import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{DataFrame, Row} -import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event, EventBuilder} -import fr.polytechnique.cmap.cnam.etl.extractors.{EventRowExtractor, Extractor} -import fr.polytechnique.cmap.cnam.etl.sources.Sources - -trait SsrExtractor[EventType <: AnyEvent] extends Extractor[EventType] with SsrSource with EventRowExtractor { - - val columnName: String - - val eventBuilder: EventBuilder - - def getInput(sources: Sources): DataFrame = sources.ssr.get.select(ColNames.all.map(col): _*).estimateStayStartTime - - def isInStudy(codes: Set[String]) - (row: Row): Boolean = codes.exists(code(row).startsWith(_)) - - def isInExtractorScope(row: Row): Boolean = !row.isNullAt(row.fieldIndex(columnName)) - - def builder(row: Row): Seq[Event[EventType]] = { - lazy val patientId = extractPatientId(row) - lazy val groupId = extractGroupId(row) - lazy val eventDate = extractStart(row) - lazy val endDate = extractEnd(row) - lazy val weight = extractWeight(row) - - Seq(eventBuilder[EventType](patientId, groupId, code(row), weight, eventDate, endDate)) - } - - def code = (row: Row) => row.getAs[String](columnName) - - def extractPatientId(r: Row): String = { - r.getAs[String](ColNames.PatientID) - } - - override def extractGroupId(r: Row): String = { - r.getAs[String](ColNames.EtaNum) + "_" + - 
r.getAs[String](ColNames.RhaNum) + "_" + - r.getAs[String](ColNames.RhsNum) + "_" + - r.getAs[Int](ColNames.Year).toString - } - - def extractStart(r: Row): Timestamp = r.getAs[Timestamp](NewColumns.EstimatedStayStart) -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/takeOverReasons/HadTakeOverReasonExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/takeOverReasons/HadTakeOverReasonExtractor.scala deleted file mode 100644 index e2a62949..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/takeOverReasons/HadTakeOverReasonExtractor.scala +++ /dev/null @@ -1,28 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.takeOverReasons - -import fr.polytechnique.cmap.cnam.etl.events._ -import fr.polytechnique.cmap.cnam.etl.extractors.had.HadExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.takeOverReasons.HadMainTakeOverExtractor.code -import org.apache.spark.sql.Row - -object HadMainTakeOverExtractor extends HadExtractor[MedicalTakeOverReason] { - - final override val columnName: String = ColNames.PEC_PAL - - override val eventBuilder: EventBuilder = HadMainTakeOver - - override def isInStudy(codes: Set[String]) - (row: Row): Boolean = codes.exists(code(row) == _) -} - -object HadAssociatedTakeOverExtractor extends HadExtractor[MedicalTakeOverReason] { - - final override val columnName: String = ColNames.PEC_ASS - - override val eventBuilder: EventBuilder = HadAssociatedTakeOver - - override def isInStudy(codes: Set[String]) - (row: Row): Boolean = codes.exists(code(row) == _) -} - - diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/Tracklosses.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/Tracklosses.scala deleted file mode 100644 index 615b9fc2..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/Tracklosses.scala +++ /dev/null @@ -1,64 +0,0 @@ -// License: BSD 3 clause - -package 
fr.polytechnique.cmap.cnam.etl.extractors.tracklosses - -import java.sql.Timestamp -import org.apache.spark.sql.expressions.Window -import org.apache.spark.sql.functions._ -import org.apache.spark.sql.types.TimestampType -import org.apache.spark.sql.{Column, DataFrame, Dataset} -import fr.polytechnique.cmap.cnam.etl.events.{Event, Trackloss} -import fr.polytechnique.cmap.cnam.etl.sources.Sources - -class Tracklosses(config: TracklossesConfig) { - - import Tracklosses._ - - def extract(sources: Sources): Dataset[Event[Trackloss]] = { - - val dcir: DataFrame = sources.dcir.get - - import dcir.sqlContext.implicits._ - dcir.select(inputColumns: _*) - .filter(col("drug").isNotNull) - .select(col("patientID"), col("eventDate")) - .distinct - .withInterval(config.studyEnd) - .filterTrackLosses(config.emptyMonths) - .withTrackLossDate(config.tracklossMonthDelay) - .map(Trackloss.fromRow(_, dateCol = "tracklossDate")) - } -} - -object Tracklosses { - - val inputColumns: List[Column] = List( - col("NUM_ENQ").as("patientID"), - coalesce( - col("ER_PHA_F__PHA_PRS_IDE"), - col("ER_PHA_F__PHA_PRS_C13") - ).as("drug"), - col("EXE_SOI_DTD").as("eventDate") - ) - - implicit class TracklossesDataFrame(data: DataFrame) { - - def withInterval(lastDate: Timestamp): DataFrame = { - val window = Window.partitionBy(col("patientID")).orderBy(col("eventDate").asc) - data - .withColumn("nextDate", lead(col("eventDate"), 1, lastDate).over(window)) - .filter(col("nextDate").isNotNull) - .withColumn("interval", months_between(col("nextDate"), col("eventDate")).cast("int")) - .drop(col("nextDate")) - } - - def filterTrackLosses(emptyMonths: Int): DataFrame = { - data.filter(col("interval") >= emptyMonths) - } - - def withTrackLossDate(tracklossMonthDelay: Int): DataFrame = { - data.withColumn("tracklossDate", add_months(col("eventDate"), tracklossMonthDelay).cast(TimestampType)) - } - } - -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/TracklossesConfig.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/TracklossesConfig.scala deleted file mode 100644 index c7daa5c9..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/TracklossesConfig.scala +++ /dev/null @@ -1,12 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.tracklosses - -import java.sql.Timestamp -import fr.polytechnique.cmap.cnam.etl.config.CaseClassConfig - -case class TracklossesConfig( - studyEnd: Timestamp, - emptyMonths: Int = 4, - tracklossMonthDelay: Int = 2) - extends CaseClassConfig diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/filters/PatientFiltersImplicits.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/filters/PatientFiltersImplicits.scala index 12fdd3ac..72956bfe 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/filters/PatientFiltersImplicits.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/filters/PatientFiltersImplicits.scala @@ -6,10 +6,9 @@ import java.sql.Timestamp import org.apache.spark.sql.expressions.Window import org.apache.spark.sql.functions._ import org.apache.spark.sql.types.{BooleanType, TimestampType} -import org.apache.spark.sql.{Column, DataFrame, Dataset} +import org.apache.spark.sql.{Column, Dataset} import fr.polytechnique.cmap.cnam.etl.events._ import fr.polytechnique.cmap.cnam.etl.patients.Patient -import fr.polytechnique.cmap.cnam.util.RichDataFrame._ /* * The architectural decisions regarding the patient filters can be found in the following page: diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/Sources.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/Sources.scala index 67097fb8..b3ae1114 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/Sources.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/Sources.scala @@ -5,8 +5,8 @@ package fr.polytechnique.cmap.cnam.etl.sources import java.sql.Timestamp import 
org.apache.spark.sql.{DataFrame, SQLContext} import fr.polytechnique.cmap.cnam.etl.config.study.StudyConfig.InputPaths -import fr.polytechnique.cmap.cnam.etl.sources.data.{DcirSource, McoCeSource, McoSource, SsrSource, SsrCeSource, HadSource} -import fr.polytechnique.cmap.cnam.etl.sources.value.{DosagesSource, IrBenSource, IrImbSource, IrPhaSource} +import fr.polytechnique.cmap.cnam.etl.sources.data._ +import fr.polytechnique.cmap.cnam.etl.sources.value._ case class Sources( dcir: Option[DataFrame] = None, @@ -18,10 +18,15 @@ case class Sources( irBen: Option[DataFrame] = None, irImb: Option[DataFrame] = None, irPha: Option[DataFrame] = None, + irNat: Option[DataFrame] = None, dosages: Option[DataFrame] = None) object Sources { - + /** Sanitize all sources with usual filters for snds analysis. + * + * @param sources An instance containing all available SNDS data and value tables. + * @return + */ def sanitize(sources: Sources): Sources = { sources.copy( dcir = sources.dcir.map(DcirSource.sanitize), @@ -33,10 +38,18 @@ object Sources { irBen = sources.irBen.map(IrBenSource.sanitize), irImb = sources.irImb.map(IrImbSource.sanitize), irPha = sources.irPha.map(IrPhaSource.sanitize), + irNat = sources.irNat.map(IrNatSource.sanitize), dosages = sources.dosages.map(DosagesSource.sanitize) ) } + /** Filter sources to keep only data concerning the study period. + * + * @param sources An instance containing all available SNDS data and value tables. 
+ * @param studyStart + * @param studyEnd + * @return + */ def sanitizeDates(sources: Sources, studyStart: Timestamp, studyEnd: Timestamp): Sources = { sources.copy( dcir = sources.dcir.map(DcirSource.sanitizeDates(_, studyStart, studyEnd)), @@ -44,13 +57,21 @@ object Sources { ssr = sources.ssr.map(SsrSource.sanitizeDates(_, studyStart, studyEnd)), had = sources.had.map(HadSource.sanitizeDates(_, studyStart, studyEnd)), mcoCe = sources.mcoCe.map(McoCeSource.sanitizeDates(_, studyStart, studyEnd)), + ssrCe = sources.ssrCe.map(SsrCeSource.sanitizeDates(_, studyStart, studyEnd)), irBen = sources.irBen, irImb = sources.irImb, irPha = sources.irPha, + irNat = sources.irNat, dosages = sources.dosages ) } + /** Read all source dataframe. + * + * @param sqlContext Spark Context needed to fetch data + * @param paths + * @return + */ def read(sqlContext: SQLContext, paths: InputPaths): Sources = { this.read( sqlContext, @@ -59,9 +80,11 @@ object Sources { mcoCePath = paths.mcoCe, hadPath = paths.had, ssrPaths = paths.ssr, + ssrCePath = paths.ssrCe, irBenPath = paths.irBen, irImbPath = paths.irImb, irPhaPath = paths.irPha, + irNatPath = paths.irNat, dosagesPath = paths.dosages ) } @@ -72,22 +95,25 @@ object Sources { mcoPath: Option[String] = None, mcoCePath: Option[String] = None, hadPath: Option[String] = None, - //@todo The merge of ssr_sej and ssr_c should be finally moved to the Flattening project - ssrPaths: Option[List[String]] = None, + ssrPaths: Option[String] = None, + ssrCePath: Option[String] = None, irBenPath: Option[String] = None, irImbPath: Option[String] = None, irPhaPath: Option[String] = None, + irNatPath: Option[String] = None, dosagesPath: Option[String] = None): Sources = { Sources( dcir = dcirPath.map(DcirSource.read(sqlContext, _)), mco = mcoPath.map(McoSource.read(sqlContext, _)), mcoCe = mcoCePath.map(McoCeSource.read(sqlContext, _)), - had = hadPath.map(HadSource.read(sqlContext, _)), ssr = ssrPaths.map(SsrSource.read(sqlContext, _)), + ssrCe = 
ssrCePath.map(SsrCeSource.read(sqlContext, _)), + had = hadPath.map(HadSource.read(sqlContext, _)), irBen = irBenPath.map(IrBenSource.read(sqlContext, _)), irImb = irImbPath.map(IrImbSource.read(sqlContext, _)), irPha = irPhaPath.map(IrPhaSource.read(sqlContext, _)), + irNat = irNatPath.map(IrNatSource.read(sqlContext, _)), dosages = dosagesPath.map(DosagesSource.read(sqlContext, _)) ) } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DataSourceManager.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DataSourceManager.scala index 86c97ce2..36ec62fd 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DataSourceManager.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DataSourceManager.scala @@ -11,8 +11,7 @@ trait DataSourceManager extends SourceManager { val EXE_SOI_DTD: Column = col("EXE_SOI_DTD") - /** - * This method santize the sources based on the passed dates. + /** Sanitize the sources based on the passed dates. * * @param sourceData the data source that will be sanitized * @param studyStart the study start date diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirFilters.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirFilters.scala index 547c2b44..a88f417f 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirFilters.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirFilters.scala @@ -5,6 +5,10 @@ package fr.polytechnique.cmap.cnam.etl.sources.data import org.apache.spark.sql.DataFrame private[data] class DcirFilters(rawDcir: DataFrame) { + /** Remove lines for information only. + * + * @return dataframe with lines corresponding to some real interaction with the healthcare system. 
+ */ def filterInformationFlux: DataFrame = { rawDcir.where(DcirSource.DPN_QLF =!= 71) } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirSource.scala index ee6393c8..2272c55e 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirSource.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DcirSource.scala @@ -12,6 +12,12 @@ object DcirSource extends DataSourceManager with DcirSourceSanitizer { val BEN_CDI_NIR: Column = col("BEN_CDI_NIR") val DPN_QLF: Column = col("DPN_QLF") + /** Sanitize the dcir with the usual filters for analysis: + * - remove the lines without a proper *Nature de la prestation* + * - remove the information-only lines + * @param dcir the data source that will be sanitized + * @return a new instance of the Source, with the sanitized data + */ override def sanitize(dcir: DataFrame): DataFrame = { dcir.where(DcirSource.BSE_PRS_NAT =!= 0) .filterInformationFlux diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DoublonFinessPmsi.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DoublonFinessPmsi.scala new file mode 100644 index 00000000..855f762f --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/DoublonFinessPmsi.scala @@ -0,0 +1,39 @@ +package fr.polytechnique.cmap.cnam.etl.sources.data + +object DoublonFinessPmsi { + /** List of geographic FINESS for APHP, HCL and APHM (duplicates, as this information also goes back through the legal FINESS). + * This list is detailed on the [snds documentation](https://documentation-snds.health-data-hub.fr/fiches/depenses_hopital_public.html#valorisation-des-sejours-a-l-hopital-public) + * and recommended by people building the "Reste à Charge" Database on the SNDS. It is more exhaustive than the previous list.
But it should + */ + val specialHospitalCodes = List( + //APHP + "600100093","600100101","620100016","640790150","640797098","750100018","750806226", + "750100356","750802845","750801524","750100067","750100075","750100042","750805228", + "750018939","750018988","750100091","750100083","750100109","750833345","750019069", + "750803306","750019028","750100125","750801441","750019119","750100166","750100141", + "750100182","750100315","750019648","750830945","750008344","750803199","750803447", + "750100216","750100208","750833337","750000358","750019168","750809576","750100299", + "750041543","750100232","750802258","750803058","750803454","750100273","750801797", + "750803371","830100012","830009809","910100015","910100031","910100023","910005529", + "920100013","920008059","920100021","920008109","920100039","920100047","920812930", + "920008158","920100054","920008208","920100062","920712551","920000122","930100052", + "930100037","930018684","930812334","930811294","930100045","930011408","930811237", + "930100011","940018021","940100027","940100019","940170087","940005739","940100076", + "940100035","940802291","940100043","940019144","940005788","940100050","940802317", + "940100068","940005838","950100024","950100016", + //APHM + "130808231","130809775","130782931", + "130806003","130783293","130804305","130790330","130804297","130783236","130796873", + "130808520","130799695","130802085","130808256","130806052","130808538","130802101", + "130796550","130014558","130784234","130035884","130784259","130796279","130792856", + "130017239","130792534","130793698","130792898","130808546","130789175","130780521", + "130033996","130018229", + //HCL + "90787460","690007422","690007539","690784186","690787429", + "690783063","690007364","690787452","690007406","690787486","690784210","690799416", + "690784137","690007281","690799366","690784202","690023072","690787577","690784194", + "690007380","690784129","690029194","690806054","690029210","690787767","690784178", 
+ "690783154","690799358","690787817","690787742","690784152","690784145","690783121", + "690787478","690007455","690787494","830100558","830213484" + ) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadFilters.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadFilters.scala index 2b0d59f0..cf2b7101 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadFilters.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadFilters.scala @@ -1,13 +1,13 @@ package fr.polytechnique.cmap.cnam.etl.sources.data import org.apache.spark.sql.{Column, DataFrame} +import fr.polytechnique.cmap.cnam.etl.sources.data.DoublonFinessPmsi.specialHospitalCodes private[data] class HadFilters(rawHad: DataFrame) { - - /** Removing return codes which significate error in the PMSI - * - * This is a classic filter for all PMSI products. Other filters may be implemented in the future. - * */ + /** Filter out HAD corrupted stays as returned by the ATIH. + * + * @return dataframe cleaned of HAD corrupted stays + */ def filterHadCorruptedHospitalStays: DataFrame = { val fictionalAndFalseHospitalStaysFilter: Column = HadSource .NIR_RET === "0" and HadSource.SEJ_RET === "0" and HadSource @@ -17,5 +17,12 @@ private[data] class HadFilters(rawHad: DataFrame) { rawHad.filter(fictionalAndFalseHospitalStaysFilter) } + /** Remove geographic FINESS duplicates from APHP, APHM and HCL.
+ * + * @return dataframe without FINESS duplicates + */ + def filterSpecialHospitals: DataFrame = { + rawHad.where(!HadSource.ETA_NUM_EPMSI.isin(specialHospitalCodes: _*)) + } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSource.scala index c0af76ef..47f78cec 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSource.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSource.scala @@ -1,36 +1,23 @@ package fr.polytechnique.cmap.cnam.etl.sources.data -import org.apache.spark.sql.functions.{col, to_date, year} -import org.apache.spark.sql.{Column, DataFrame, SQLContext} +import org.apache.spark.sql.functions.col +import org.apache.spark.sql.{Column, DataFrame} /** - * Extractor class for the HAD table - * - * + * Extractor class for the HAD table + * This filtering is explained here + * https://datainitiative.atlassian.net/wiki/pages/viewpage.action?pageId=40304642 */ object HadSource extends DataSourceManager with HadSourceSanitizer { - // unused for filtering -// val ETA_NUM_EPMSI: Column = col("ETA_NUM_EPMSI") -// val RHAD_NUM: Column = col("RHAD_NUM") -// val DP: Column = col("HAD_B__DGN_PAL") -// val PEC_PAL: Column = col("HAD_B_PEC_PAL") -// val PEC_ASS: Column = col("HAD_B_PEC_ASS") -// val DA: Column = col("HAD_D__DGN_ASS") -// val CCAM: Column = col("HAD_A__CCAM_COD") - + val ETA_NUM_EPMSI: Column = col("ETA_NUM_EPMSI") val NIR_RET: Column = col("NIR_RET") val SEJ_RET: Column = col("SEJ_RET") val FHO_RET: Column = col("FHO_RET") val PMS_RET: Column = col("PMS_RET") val DAT_RET: Column = col("DAT_RET") - -// val ENT_DAT: Column = col("ENT_DAT") -// val SOR_DAT: Column = col("SOR_DAT") val Year: Column = col("year") - //val foreignKeys: List[String] = List("ETA_NUM_EPMSI", "RHA_NUM", "year") - override def sanitize(rawHad: DataFrame): DataFrame = { /** * This filtering is explained here @@ -38,5 +25,6 @@ object
HadSource extends DataSourceManager with HadSourceSanitizer { */ rawHad .filterHadCorruptedHospitalStays + .filterSpecialHospitals } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/McoFilters.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/McoFilters.scala index 9ecec783..9ae8a425 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/McoFilters.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/McoFilters.scala @@ -3,13 +3,21 @@ package fr.polytechnique.cmap.cnam.etl.sources.data import org.apache.spark.sql.{Column, DataFrame} +import fr.polytechnique.cmap.cnam.etl.sources.data.DoublonFinessPmsi.specialHospitalCodes private[data] class McoFilters(rawMco: DataFrame) { - + /** Filter out FINESS duplicates from APHP, APHM and HCL. + * + * @return dataframe without the special hospitals + */ def filterSpecialHospitals: DataFrame = { - rawMco.where(!McoSource.ETA_NUM.isin(McoFilters.specialHospitalCodes: _*)) + rawMco.where(!McoSource.ETA_NUM.isin(specialHospitalCodes: _*)) } + /** Filter out shared stays (between hospitals). + * + * @return dataframe without shared hospital stays + */ def filterSharedHospitalStays: DataFrame = { val duplicateHospitalsFilter: Column = McoSource.SEJ_TYP.isNull or McoSource .SEJ_TYP =!= "B" or (McoSource.GRG_GHM.like("28%") and !McoSource.GRG_GHM @@ -17,14 +25,26 @@ private[data] class McoFilters(rawMco: DataFrame) { rawMco.filter(duplicateHospitalsFilter) } + /** Filter out induced abortion (IVG). + * + * @return dataframe without IVG stays + */ def filterIVG: DataFrame = { rawMco.filter(McoSource.GRG_GHM =!= "14Z08Z") } + /** Filter out non-reimbursed stays. + * + * @return dataframe without non-reimbursed stays + */ def filterNonReimbursedStays: DataFrame = { rawMco.filter(McoSource.GHS_NUM =!= "9999") } + /** Filter out MCO corrupted stays as returned by the ATIH.
+ * + * @return + */ def filterMcoCorruptedHospitalStays: DataFrame = { val fictionalAndFalseHospitalStaysFilter: Column = !McoSource.GRG_GHM.like("90%") and McoSource .NIR_RET === "0" and McoSource.SEJ_RET === "0" and McoSource @@ -33,6 +53,10 @@ private[data] class McoFilters(rawMco: DataFrame) { rawMco.filter(fictionalAndFalseHospitalStaysFilter) } + /** Filter out McoCe corrupted stays as returned by the ATIH. + * + * @return + */ def filterMcoCeCorruptedHospitalStays: DataFrame = { val fictionalAndFalseHospitalStaysFilter: Column = McoCeSource.NIR_RET === "0" and McoCeSource .NAI_RET === "0" and McoCeSource.SEX_RET === "0" and McoCeSource @@ -43,17 +67,6 @@ private[data] class McoFilters(rawMco: DataFrame) { } private[data] object McoFilters { - - val specialHospitalCodes = List( - "130780521", "130783236", "130783293", "130784234", "130804297", "600100101", "690783154", - "690784137", "690784152", "690784178", "690787478", "750041543", "750100018", "750100042", - "750100075", "750100083", "750100091", "750100109", "750100125", "750100166", "750100208", - "750100216", "750100232", "750100273", "750100299", "750801441", "750803447", "750803454", - "830100558", "910100015", "910100023", "920100013", "920100021", "920100039", "920100047", - "920100054", "920100062", "930100011", "930100037", "930100045", "940100027", "940100035", - "940100043", "940100050", "940100068", "950100016" - ) - // radiotherapie & dialyse exceptions val GRG_GHMExceptions = List("28Z14Z", "28Z15Z", "28Z16Z") diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrCeSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrCeSource.scala index 7f415386..9a5ac54d 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrCeSource.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrCeSource.scala @@ -16,6 +16,7 @@ object SsrCeSource extends DataSourceManager with SsrSourceSanitizer { override def sanitize(ssrCe: 
DataFrame): DataFrame = { ssrCe + .filterSpecialHospitals .filterSsrCeCorruptedHospitalStays } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFilters.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFilters.scala index 86e1650c..30cbb0f9 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFilters.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFilters.scala @@ -1,11 +1,15 @@ package fr.polytechnique.cmap.cnam.etl.sources.data import org.apache.spark.sql.{Column, DataFrame} +import fr.polytechnique.cmap.cnam.etl.sources.data.DoublonFinessPmsi.specialHospitalCodes private[data] class SsrFilters(rawSsr: DataFrame) { - + /** Filter out SSR corrupted stays as returned by the ATIH. + * + * @return dataframe cleaned of SSR corrupted stays + */ def filterSsrCorruptedHospitalStays: DataFrame = { - val fictionalAndFalseHospitalStaysFilter: Column = SsrSource + val fictionalAndFalseHospitalStaysFilter: Column = !SsrSource.GRG_GME.like("90%") and SsrSource .NIR_RET === "0" and SsrSource.SEJ_RET === "0" and SsrSource .FHO_RET === "0" and SsrSource.PMS_RET === "0" and SsrSource .DAT_RET === "0" @@ -13,6 +17,10 @@ private[data] class SsrFilters(rawSsr: DataFrame) { rawSsr.filter(fictionalAndFalseHospitalStaysFilter) } + /** Filter out SsrCe corrupted stays as returned by the ATIH. + * + * @return dataframe cleaned of SsrCe corrupted stays + */ def filterSsrCeCorruptedHospitalStays: DataFrame = { val fictionalAndFalseHospitalStaysFilter: Column = SsrCeSource.NIR_RET === "0" and SsrCeSource .NAI_RET === "0" and SsrCeSource.SEX_RET === "0" and SsrCeSource @@ -20,6 +28,14 @@ private[data] class SsrFilters(rawSsr: DataFrame) { rawSsr.filter(fictionalAndFalseHospitalStaysFilter) } + + /** Filter out FINESS duplicates.
+ * + * @return + */ + def filterSpecialHospitals: DataFrame = { + rawSsr.where(!SsrSource.ETA_NUM.isin(specialHospitalCodes: _*)) + } } private[data] object SsrFilters \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSource.scala index 787f074a..eeb2eb89 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSource.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSource.scala @@ -1,9 +1,7 @@ package fr.polytechnique.cmap.cnam.etl.sources.data import org.apache.spark.sql.functions.col -import org.apache.spark.sql.{Column, DataFrame, SQLContext} -import org.apache.spark.sql.functions.to_date -import org.apache.spark.sql.functions.year +import org.apache.spark.sql.{Column, DataFrame} /** * Extractor class for the SSR table @@ -15,22 +13,22 @@ object SsrSource extends DataSourceManager with SsrSourceSanitizer { val ETA_NUM: Column = col("ETA_NUM") val RHA_NUM: Column = col("RHA_NUM") val RHS_NUM: Column = col("RHS_NUM") - val MOR_PRP: Column = col("MOR_PRP") - val ETL_AFF: Column = col("ETL_AFF") - val MOI_ANN_SOR_SEJ: Column = col("MOI_ANN_SOR_SEJ") - val RHS_ANT_SEJ_ENT: Column = col("RHS_ANT_SEJ_ENT") - val FP_PEC: Column = col("FP_PEC") - - val NIR_RET: Column = col("SSR_C__NIR_RET") - val SEJ_RET: Column = col("SSR_C__SEJ_RET") - val FHO_RET: Column = col("SSR_C__FHO_RET") - val PMS_RET: Column = col("SSR_C__PMS_RET") - val DAT_RET: Column = col("SSR_C__DAT_RET") - val ENT_DAT: Column = col("SSR_C__ENT_DAT") - val SOR_DAT: Column = col("SSR_C__SOR_DAT") + val MOR_PRP: Column = col("SSR_B__MOR_PRP") + val ETL_AFF: Column = col("SSR_B__ETL_AFF") + val MOI_ANN_SOR_SEJ: Column = col("SSR_B__MOI_ANN_SOR_SEJ") + val RHS_ANT_SEJ_ENT: Column = col("SSR_B__RHS_ANT_SEJ_ENT") + val FP_PEC: Column = col("SSR_B__FP_PEC") + val GRG_GME: Column = col("SSR_B__GRG_GME") + val NIR_RET: Column = 
col("NIR_RET") + val SEJ_RET: Column = col("SEJ_RET") + val FHO_RET: Column = col("FHO_RET") + val PMS_RET: Column = col("PMS_RET") + val DAT_RET: Column = col("DAT_RET") + val ENT_DAT: Column = col("ENT_DAT") + val SOR_DAT: Column = col("SOR_DAT") val Year: Column = col("year") - override val EXE_SOI_DTD: Column = col("SSR_C__EXE_SOI_DTD") + override val EXE_SOI_DTD: Column = col("EXE_SOI_DTD") val foreignKeys: List[String] = List("ETA_NUM", "RHA_NUM", "year") @@ -40,28 +38,7 @@ object SsrSource extends DataSourceManager with SsrSourceSanitizer { * https://datainitiative.atlassian.net/wiki/pages/viewpage.action?pageId=40304642 */ rawSsr + .filterSpecialHospitals .filterSsrCorruptedHospitalStays } - - def read(sqlContext: SQLContext, path: List[String]): DataFrame = { - readAnnotateJoin(sqlContext, path, "SSR_C") - } - - private def readAnnotateJoin(sqlContext: SQLContext, paths: List[String], joinedTableName: String): DataFrame = { - val ssrSej = sqlContext.read.parquet(paths.head) - val ssrC = sqlContext.read.parquet(paths(1)) - ssrSej.join( - ssrC.addPrefixYear(joinedTableName, foreignKeys), foreignKeys, "left_outer") - } - - implicit class TableHelper(df: DataFrame) { - - def addPrefixYear(prefix: String, except: List[String]): DataFrame = { - val renamedColumns = df.columns.map { - case colName if !except.contains(colName) => prefix + "__" + colName - case keyCol => keyCol - } - df.toDF(renamedColumns: _*).withColumn("year", year(to_date(col("SSR_C__EXE_SOI_DTD")))) - } - } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/value/IrNatSource.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/value/IrNatSource.scala new file mode 100644 index 00000000..20743b99 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/sources/value/IrNatSource.scala @@ -0,0 +1,8 @@ + +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.sources.value + +import fr.polytechnique.cmap.cnam.etl.sources.SourceManager + +object 
IrNatSource extends SourceManager diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/drugprescription/DrugPrescriptionTransformer.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/drugprescription/DrugPrescriptionTransformer.scala new file mode 100644 index 00000000..d311248f --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/drugprescription/DrugPrescriptionTransformer.scala @@ -0,0 +1,38 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.transformers.drugprescription + +import org.apache.spark.sql.Dataset +import fr.polytechnique.cmap.cnam.etl.events.{Drug, DrugPrescription, Event} + +class DrugPrescriptionTransformer extends Serializable { + /** + * Transform DrugPurchases Events to DrugPrescription Events. + * @param drugs [[Dataset]][[Event]][[Drug]] + * @return [[Dataset]][[Event]][[DrugPrescription]] + */ + def transform(drugs: Dataset[Event[Drug]]): Dataset[Event[DrugPrescription]] = { + + val sqlCtx = drugs.sqlContext + import sqlCtx.implicits._ + drugs + .groupByKey(drug => (drug.groupID, drug.patientID, drug.start)) + .mapGroups((_, drugs) => fromDrugs(drugs.toList)) + .distinct() + } + + /** + * Combines [[Drug]] [[Event]]s to form an [[Event]] of type [[DrugPrescription]]. + * WARNING: Drug Events must share the same patientID, groupID and start. + * @param drugs Events to be combined. Must share the same patientID, groupID and start. + * @return DrugPrescription Event whose value is the concatenation of the values of the passed Drugs.
+ */ + def fromDrugs(drugs: List[Event[Drug]]): Event[DrugPrescription] = { + val first = drugs.head + val value = drugs + .map(_.value) + .sorted + .reduce((l, r) => l.concat("_").concat(r)) + DrugPrescription(first.patientID, value, first.weight, first.groupID, first.start) + } +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureDuration.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureDuration.scala index 3f822766..b06e594f 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureDuration.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureDuration.scala @@ -48,6 +48,8 @@ case class ExposureDuration(patientID: String, value: String, period: Period, sp RightRemainingPeriod(ExposureDuration(self.patientID, self.value, p2, self.span)) ) } + // avoid scala match may not be exhaustive + case _ => NullRemainingPeriod } } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/Columns.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/Columns.scala index 882a8d9a..8fc941d4 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/Columns.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/Columns.scala @@ -27,8 +27,4 @@ private[follow_up] object Columns { final val TracklossDate = "trackloss" final val FirstTargetDiseaseDate = "firstTargetDisease" - object EndReasons extends Enumeration { - val Death, Disease, Trackloss, ObservationEnd = Value - } - } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformer.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformer.scala index e7f8da14..0957c8cf 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformer.scala +++ 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformer.scala @@ -4,15 +4,30 @@ package fr.polytechnique.cmap.cnam.etl.transformers.follow_up import java.sql.Timestamp import scala.util.Try -import org.apache.spark.sql.functions._ import org.apache.spark.sql.Dataset +import org.apache.spark.sql.functions._ import fr.polytechnique.cmap.cnam.etl.events._ import fr.polytechnique.cmap.cnam.etl.patients.Patient - +/** It builds a FollowUp dataset from + * Dataset[(Patient, Event[ObservationPeriod])], + * Dataset[Event[Molecule]] (this dataset is not used in the processing and should be removed in future versions), + * Dataset[Event[Outcome]] (this dataset is not used in the processing and should be removed in future versions) and + * Dataset[Event[Trackloss]]. + * + * @param config A config object that contains the values needed to set the study parameters. + */ class FollowUpTransformer(config: FollowUpTransformerConfig) { - + /** The main method of this transformer. It combines multiple basic Events to form a FollowUp Dataset. + * + * @param patients A dataset of [[fr.polytechnique.cmap.cnam.etl.patients.Patient]] joined with + * a dataset of [[fr.polytechnique.cmap.cnam.etl.events.ObservationPeriod]]. + * @param dispensations A dataset of [[fr.polytechnique.cmap.cnam.etl.events.Molecule]]. + * @param outcomes A dataset of [[fr.polytechnique.cmap.cnam.etl.events.Outcome]]. + * @param tracklosses A dataset of [[fr.polytechnique.cmap.cnam.etl.events.Trackloss]]. + * @return A dataset of Event[FollowUp] type ([[fr.polytechnique.cmap.cnam.etl.events.FollowUp]]).
+ */ def transform( patients: Dataset[(Patient, Event[ObservationPeriod])], dispensations: Dataset[Event[Molecule]], @@ -20,12 +35,18 @@ class FollowUpTransformer(config: FollowUpTransformerConfig) { tracklosses: Dataset[Event[Trackloss]]): Dataset[Event[FollowUp]] = { import patients.sparkSession.implicits._ - import FollowUpTransformerUtilities._ import Columns._ + import FollowUpTransformerUtilities._ val delayMonths = config.delayMonths + /** It takes the Dataset[(Patient, Event[ObservationPeriod])] and performs several transformations: + * 1. Extract the patientId value. + * 2. Correct the observation period start date by adding the delayMonths value and comparing it to the observation period end date. + * 3. Calculate the min of the dates. + * 4. Return a PatientDates dataset [[fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerUtilities.PatientDates]]. + */ val patientDates: Dataset[PatientDates] = patients .map { e => PatientDates( @@ -40,17 +61,23 @@ class FollowUpTransformer(config: FollowUpTransformerConfig) { .agg( min(DeathDate).as(DeathDate), min(FollowUpStart).as(FollowUpStart), - min(ObservationEnd).as(ObservationEnd) + min(Columns.ObservationEnd).as(Columns.ObservationEnd) ) .map( e => PatientDates( e.getAs[String](PatientID), Option(e.getAs[Timestamp](DeathDate)), Option(e.getAs[Timestamp](FollowUpStart)), - Option(e.getAs[Timestamp](ObservationEnd)) + Option(e.getAs[Timestamp](Columns.ObservationEnd)) ) ) + /** It takes the patientDates dataset and joins it with the tracklosses dataset, performing the following algorithm: + * 1. Extract the patientId value and correct the trackloss date by comparing it with the followUpStart date. + * 2. Filter out empty corrected dates. + * 3. Extract the min tracklossDate. + * 4. Return a TrackLossDate [[fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerUtilities.TrackLossDate]].
+ */ val tracklossDates: Dataset[TrackLossDate] = patientDates .joinWith(tracklosses, tracklosses.col(PatientID) === patientDates.col(PatientID)) .map(e => TrackLossDate(e._2.patientID, tracklossDateCorrected(e._2.start, e._1.followUpStart.get))) @@ -61,30 +88,25 @@ class FollowUpTransformer(config: FollowUpTransformerConfig) { ) .map(e => TrackLossDate(e.getAs[String](PatientID), Option(e.getAs[Timestamp](TracklossDate)))) - val disease = config.outcomeName.getOrElse(None).toString - - val outcomesDisease: Dataset[Event[Outcome]] = outcomes - .filter(e => e.value.matches(s".*$disease.*")) - .groupBy(col(PatientID)) - .agg( - min(Start).as(Start) - ).map(e => Outcome(e.getAs[String](PatientID), disease, e.getAs[Timestamp](Start))) - + /** It joins the patientDates dataset with the tracklossDates dataset, carrying out the following algorithm: + * 1. Retrieve the trackloss date if it exists, None otherwise. + * 2. Using the death date, the trackloss date and the observation end date, it calculates through + * [[fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerUtilities.endReason]] the follow-up's + * end date and reason. + * 3. Filter out the empty followUp end dates. + * 4. Return a FollowUp [[fr.polytechnique.cmap.cnam.etl.events.FollowUp]].
+ */ patientDates .joinWith(tracklossDates, tracklossDates.col(PatientID) === patientDates.col(PatientID), "left_outer") - .joinWith(outcomesDisease, col(PatientID) === col(s"_1.$PatientID"), "left_outer") .map { e => - val trackloss: Option[Timestamp] = Try(e._1._2.trackloss).getOrElse(None) - val disease: Option[Timestamp] = Try(Option(e._2.start)).getOrElse(None) + val trackloss: Option[Timestamp] = Try(e._2.trackloss).getOrElse(None) val followUpEndReason = endReason( - DeathReason(date = e._1._1.deathDate), - DiseaseReason(date = disease), + DeathReason(date = e._1.deathDate), TrackLossReason(date = trackloss), - ObservationEndReason(date = e._1._1.observationEnd) + ObservationEndReason(date = e._1.observationEnd) ) - FollowUp(e._1._1.patientID, followUpEndReason.reason, e._1._1.followUpStart.get, followUpEndReason.date.get) + FollowUp(e._1.patientID, followUpEndReason.reason, e._1.followUpStart.get, followUpEndReason.date.get) }.filter(e => e.end.nonEmpty) - } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerUtilities.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerUtilities.scala index 143c4145..63fd76d5 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerUtilities.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerUtilities.scala @@ -1,93 +1,163 @@ +// License: BSD 3 clause + package fr.polytechnique.cmap.cnam.etl.transformers.follow_up import java.sql.Timestamp -import fr.polytechnique.cmap.cnam.etl.transformers.follow_up.Columns.EndReasons import fr.polytechnique.cmap.cnam.util.datetime.implicits.addMonthsToRichTimestamp - +/** Factory for FollowUp utilities. */ object FollowUpTransformerUtilities { - - case class PatientDates( + /** It stores the dates needed for each patient. + * + * @param patientID The value patientID from dataset.
+ * @param deathDate The value deathDate from dataset. + * @param followUpStart The value followUpStart from dataset. + * @param observationEnd The value observationEnd from dataset. + */ + private[follow_up] case class PatientDates( patientID: String, deathDate: Option[Timestamp], followUpStart: Option[Timestamp], observationEnd: Option[Timestamp]) - - case class TrackLossDate( + /** It stores each patient with their trackloss date. + * + * @param patientID The value patientID from dataset. + * @param trackloss The value trackloss from dataset. + */ + private[follow_up] case class TrackLossDate( patientID: String, trackloss: Option[Timestamp]) - case class FollowUpEnd(reason: String, date: Option[Timestamp]) + /** It stores the follow-up's end reason and its date. + * + * @param reason A string value for the end reason. + * @param date The date of the end of the follow-up. + */ + private[follow_up] case class FollowUpEnd(reason: String, date: Option[Timestamp]) + + /** It stores the list of reasons. */ + private[follow_up] sealed trait EndReason extends Enumeration { val Death, Trackloss, ObservationEnd = Value val endReason: String } + + /** It's an object to store Death as endReason. */ + private[follow_up] case object Death extends EndReason { val endReason = Death .toString } - abstract sealed class FollowUpEndReason { val reason: String + /** It's an object to store Trackloss as endReason. */ + private[follow_up] case object Trackloss extends EndReason { val endReason = Trackloss .toString } + + /** It's an object to store ObservationEnd as endReason. */ + private[follow_up] case object ObservationEnd extends EndReason { val endReason = ObservationEnd .toString } + + /** It is needed to store the compare method, + * which allows using the min function with + * the FollowUpEndReason class type. + * + * @param endReason An object of type EndReason.
+ */ + private[follow_up] abstract sealed class FollowUpEndReason(val endReason: EndReason) { val date: Option[Timestamp] + /** It takes a FollowUpEndReason type class and compares the dates; + * if they are the same, it then compares the end reasons to return + * the correct one according to priority. + * If two reasons have the same date, + * Death is the first option, if Death is not present, the second one is Trackloss + * and the third option is ObservationEnd. + * + * @param that The other FollowUpEndReason instance. + * @return The correct FollowUpEndReason class. + */ def compare(that: FollowUpEndReason): Int = { (this.date.get compareTo that.date.get) match { - case 0 => (this.reason, that.reason) match { + case 0 => (this.endReason.endReason, that.endReason.endReason) match { case ("Death", _) => 1 case (_, "Death") => -1 - case ("Disease", "ObservationEnd") => 1 - case ("ObservationEnd", "Disease") => -1 - case (_, _) => 1 + case ("Trackloss", _) => 1 + case (_, "Trackloss") => -1 } case c => c } } } + /** It stores the implicit Ordering needed to use the min function with FollowUpEndReason type classes. */ object FollowUpEndReason { - implicit def ord[A <: FollowUpEndReason]: Ordering[A] = Ordering.by((_: A).date.get) - - implicit def ordered: Ordering[Timestamp] = new Ordering[Timestamp] { - def compare(x: Timestamp, y: Timestamp): Int = x compareTo y - } + import fr.polytechnique.cmap.cnam.util.datetime.implicits.ordered + /** Implicit ordering for the timestamps in FollowUpEndReason type case classes. + * + * The filter to avoid empty dates is mandatory. + * Example: Seq(death, trackloss, observation).filter(e => e.date.nonEmpty).min + * + */ + implicit def ord[A <: FollowUpEndReason]: Ordering[A] = Ordering.by((_: A).date.get) } + /** It stores the death reason and its date. + * + * @param date The value deathDate from dataset.
+ */ case class DeathReason( - reason: String = EndReasons.Death.toString, - date: Option[Timestamp]) extends FollowUpEndReason with Ordered[FollowUpEndReason] - - case class DiseaseReason( - reason: String = EndReasons.Disease.toString, - date: Option[Timestamp]) extends FollowUpEndReason with Ordered[FollowUpEndReason] + date: Option[Timestamp]) extends FollowUpEndReason(Death) with Ordered[FollowUpEndReason] + /** It stores the trackloss reason and its date. + * + * @param date The value trackloss from dataset TrackLossDate. + */ case class TrackLossReason( - reason: String = EndReasons.Trackloss.toString, - date: Option[Timestamp]) extends FollowUpEndReason with Ordered[FollowUpEndReason] + date: Option[Timestamp]) extends FollowUpEndReason(Trackloss) with Ordered[FollowUpEndReason] + /** It stores the observation end reason and its date. + * + * @param date The value observationEnd from dataset PatientDates. + */ case class ObservationEndReason( - reason: String = EndReasons.ObservationEnd.toString, - date: Option[Timestamp]) extends FollowUpEndReason with Ordered[FollowUpEndReason] - + date: Option[Timestamp]) extends FollowUpEndReason(ObservationEnd) with Ordered[FollowUpEndReason] + /** It returns the start date shifted by the delayMonths value from the + * config passed through the FollowUpTransformer class, or None when the shifted date is not before the end date. + */ val correctedStart: (Timestamp, Option[Timestamp], Int) => Option[Timestamp] = (start: Timestamp, end: Option[Timestamp], delayMonths: Int) => { val st: Timestamp = addMonthsToRichTimestamp(delayMonths, start) if (st.before(end.get)) Some(st) else None - } + /** It returns the start date when it is after the follow-up start, otherwise None. */ val tracklossDateCorrected: (Timestamp, Timestamp) => Option[Timestamp] = (start: Timestamp, followUpStart: Timestamp) => { if (start.after(followUpStart)) Some(start) else None } + /** It takes all FollowUpEndReason type classes and returns the min of them. + * + * @param death A DeathReason class. + * @param trackloss A TrackLossReason class.
+ * @param observation A ObservationEndReason class. + * @return FollowUpEnd class. + */ def endReason( death: DeathReason, - disease: DiseaseReason, trackloss: TrackLossReason, observation: ObservationEndReason): FollowUpEnd = { - val followUpEndReason = Seq(death, disease, trackloss, observation).filter(e => e.date.nonEmpty).min - FollowUpEnd(followUpEndReason.reason, followUpEndReason.date) + val followUpEndReason = Seq(death, trackloss, observation).filter(e => e.date.nonEmpty).min + FollowUpEnd(followUpEndReason.endReason.endReason, followUpEndReason.date) } - } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/ExposureN.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/ExposureN.scala index 7c71151a..3a2aa524 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/ExposureN.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/ExposureN.scala @@ -3,12 +3,19 @@ package fr.polytechnique.cmap.cnam.etl.transformers.interaction import cats.syntax.functor._ -import fr.polytechnique.cmap.cnam.etl.datatypes.{NullRemainingPeriod, Period, Subtractable, RemainingPeriod} +import fr.polytechnique.cmap.cnam.etl.datatypes.{NullRemainingPeriod, Period, RemainingPeriod, Subtractable} import fr.polytechnique.cmap.cnam.etl.events.{Event, Interaction} case class ExposureN(patientID: String, values: Set[String], period: Period) extends Subtractable[ExposureN] { self => + /** + * Returns duration of this ExposureN in milliseconds + * + * @return duration in millisecond as Long + */ + def toDuration: Long = self.period.end.getTime - self.period.start.getTime + def intersect(other: ExposureN): Option[ExposureN] = { if (self.patientID.equals(other.patientID) && self.values.intersect(other.values).isEmpty) { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/InteractionTransformerConfig.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/InteractionTransformerConfig.scala
index 9615cbf3..af83a417 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/InteractionTransformerConfig.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/InteractionTransformerConfig.scala
@@ -2,10 +2,17 @@ package fr.polytechnique.cmap.cnam.etl.transformers.interaction

+import me.danielpes.spark.datetime.{Period => Duration}
+import me.danielpes.spark.datetime.implicits._
import fr.polytechnique.cmap.cnam.etl.transformers.TransformerConfig

-class InteractionTransformerConfig(val level: Int) extends TransformerConfig
+class InteractionTransformerConfig(val level: Int, val minimumDuration: Duration) extends TransformerConfig

object InteractionTransformerConfig {
-  def apply(level: Int = 3): InteractionTransformerConfig = new InteractionTransformerConfig(level)
+  def apply(
+    level: Int = 3,
+    minimumDuration: Duration = 30.days): InteractionTransformerConfig = new InteractionTransformerConfig(
+    level,
+    minimumDuration
+  )
}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformer.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformer.scala
index 1ffff493..7b495d5a 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformer.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformer.scala
@@ -2,7 +2,7 @@ package fr.polytechnique.cmap.cnam.etl.transformers.interaction

-import org.apache.spark.sql.{Dataset, functions}
+import org.apache.spark.sql.{functions, Dataset}
import fr.polytechnique.cmap.cnam.etl.datatypes._
import fr.polytechnique.cmap.cnam.etl.events.{Event, Exposure, Interaction}
import fr.polytechnique.cmap.cnam.util.functions._
@@ -13,12 +13,14 @@ case class
NLevelInteractionTransformer(config: InteractionTransformerConfig) ex def joinTwoExposureNDataSet(right: Dataset[ExposureN], left: Dataset[ExposureN]): Dataset[ExposureN] = { val sqlCtx = right.sqlContext import sqlCtx.implicits._ + val minimumDuration = config.minimumDuration.totalMilliseconds right .joinWith( left, left(Event.Columns.PatientID) === right(Event.Columns.PatientID) && !left("values").geq(right("values")) ) .flatMap(e => e._1.intersect(e._2)) + .filter(i => i.toDuration >= minimumDuration) .repartition(functions.col("patientID"), functions.col("values")) .cache() } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/Columns.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/Columns.scala index 2eb75959..849c97fb 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/Columns.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/Columns.scala @@ -2,7 +2,10 @@ package fr.polytechnique.cmap.cnam.etl.transformers.observation import fr.polytechnique.cmap.cnam.etl.events.Event -private[observation] object Columns { +/** Private object for the package [[fr.polytechnique.cmap.cnam.etl.transformers.observation]] + * to retrieve the columns of event object of type ObservationPeriod. 
+ * */
+private[observation] object Columns {

  final val PatientID = Event.Columns.PatientID
  final val Start = Event.Columns.Start
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/ObservationPeriodTransformer.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/ObservationPeriodTransformer.scala
index a25544b5..0c389608 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/ObservationPeriodTransformer.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/observation/ObservationPeriodTransformer.scala
@@ -3,15 +3,26 @@ package fr.polytechnique.cmap.cnam.etl.transformers.observation

import java.sql.Timestamp

-import org.apache.spark.sql.functions._
import org.apache.spark.sql.Dataset
+import org.apache.spark.sql.functions._
import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, Event, Molecule, ObservationPeriod}
import fr.polytechnique.cmap.cnam.util.datetime.implicits._
+
+/** Creates an ObservationPeriod dataset from a dataset of type
+ * [[fr.polytechnique.cmap.cnam.etl.events.AnyEvent]].
+ *
+ * @param config A config file that contains the values needed to create the ObservationPeriod transformer.
+ */
class ObservationPeriodTransformer(config: ObservationPeriodTransformerConfig) {

  import Columns._

+ /** The main method of this transformer; it performs the transformation into an ObservationPeriod dataset.
+ *
+ * @param events A dataset of [[fr.polytechnique.cmap.cnam.etl.events.AnyEvent]].
+ * @return A dataset of Event[ObservationPeriod] type ([[fr.polytechnique.cmap.cnam.etl.events.ObservationPeriod]]).
+ */
  def transform(events: Dataset[Event[AnyEvent]]): Dataset[Event[ObservationPeriod]] = {

    val studyStart: Timestamp = config.studyStart
@@ -20,6 +31,11 @@ class ObservationPeriodTransformer(config: ObservationPeriodTransformerConfig) {

    import events.sqlContext.implicits._

+ /** Takes the events dataset and applies the following algorithm:
+ * 1. Keep events whose category is ''molecule'' and whose start date is on or after the study start date.
+ * 2. Compute the minimum start date.
+ * 3. Build an Event[ObservationPeriod] ([[fr.polytechnique.cmap.cnam.etl.events.ObservationPeriod]]).
+ */
    events.filter(
      e => e.category == Molecule.category && (e.start
        .compareTo(studyStart) >= 0)
@@ -32,8 +48,6 @@ class ObservationPeriodTransformer(config: ObservationPeriodTransformerConfig) {
        e.getAs[Timestamp](Start),
        studyEnd
      )
- )
- }
}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/patients/PatientFilters.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/patients/PatientFilters.scala
new file mode 100644
index 00000000..989e810e
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/patients/PatientFilters.scala
@@ -0,0 +1,76 @@
+package fr.polytechnique.cmap.cnam.etl.transformers.patients
+
+import java.sql.Timestamp
+import java.time.Period
+import org.apache.spark.sql.Dataset
+import fr.polytechnique.cmap.cnam.etl.extractors.patients.PatientsConfig
+import fr.polytechnique.cmap.cnam.etl.patients.Patient
+import fr.polytechnique.cmap.cnam.util.functions.makeTS
+
+class PatientFilters(config: PatientsConfig) extends Serializable {
+
+  /** Filters patients according to the config.
+   *
+   * @param patients the patients to filter.
+   * @return [[Dataset]] of Patient
+   */
+  def filterPatients(patients: Dataset[Patient]): Dataset[Patient] = {
+    import patients.sparkSession.implicits._
+    patients
+      .flatMap(controlAge(config.minAge, config.maxAge))
+      .flatMap(controlGender(config.minGender, config.maxGender))
+      .flatMap(controlDeathDate(config.minYear, config.maxYear))
+  }
+
+  /** Returns None if the patient's age at the reference date is below minAge or not below maxAge, otherwise returns Some(patient)
+   *
+   * @param patient the patient to check.
+   * @param minAge the minimum age to control with.
+   * @param maxAge the maximum age to control with.
+   * @return an Option of Patient
+   */
+  protected[patients] def controlAge(minAge: Int, maxAge: Int)(patient: Patient): TraversableOnce[Patient] = {
+    val ageReferenceDate: Timestamp = Timestamp.valueOf(config.ageReferenceDate.atStartOfDay())
+    if (patient.birthDate != null) {
+      val age = Period.between(patient.birthDate.toLocalDateTime.toLocalDate, ageReferenceDate.toLocalDateTime.toLocalDate).getYears
+      if (age >= minAge && age < maxAge) {
+        Some(patient)
+      } else {
+        None
+      }
+    }
+    else {
+      None
+    }
+  }
+
+  /** Returns None if the patient died before minYear or after the start of maxYear, otherwise returns Some(patient); patients without a death date are kept
+   *
+   * @param patient the patient to check.
+   * @param minYear the minimum year to control with.
+   * @param maxYear the maximum year to control with.
+   * @return an Option of Patient
+   */
+  protected[patients] def controlDeathDate(minYear: Int, maxYear: Int)(patient: Patient): TraversableOnce[Patient] = {
+    if (patient.deathDate.isEmpty || (patient.deathDate.get.after(makeTS(minYear, 1, 1)) && patient.deathDate.get.before(makeTS(maxYear, 1, 1)))) {
+      Some(patient)
+    } else {
+      None
+    }
+  }
+
+  /** Returns None if the patient's gender is outside the [minGender, maxGender] range, otherwise returns Some(patient)
+   *
+   * @param patient the patient to check.
+   * @param minGender the minimum gender to control with.
+   * @param maxGender the maximum gender to control with.
+   * @return an Option of Patient
+   */
+  protected[patients] def controlGender(minGender: Int, maxGender: Int)(patient: Patient): TraversableOnce[Patient] = {
+    if (patient.gender >= minGender && patient.gender <= maxGender) {
+      Some(patient)
+    } else {
+      None
+    }
+  }
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossTransformer.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossTransformer.scala
new file mode 100644
index 00000000..d6d6ef50
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossTransformer.scala
@@ -0,0 +1,37 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.transformers.tracklosses
+
+import me.danielpes.spark.datetime.implicits._
+import org.apache.spark.sql.Dataset
+import fr.polytechnique.cmap.cnam.etl.events.{Dispensation, Event, Trackloss}
+import fr.polytechnique.cmap.cnam.util.functions._
+
+class TracklossTransformer(config: TracklossesConfig) extends Serializable {
+
+  def transform[T <: Dispensation](drugs: Dataset[Event[T]]): Dataset[Event[Trackloss]] = {
+
+    val sqlCtx = drugs.sqlContext
+    import sqlCtx.implicits._
+
+    drugs.groupByKey(_.patientID).flatMapGroups((_, events) => fromDispensationToTrackloss(events)).distinct()
+
+  }
+
+  private def fromDispensationToTrackloss(events: Iterator[Event[Dispensation]]): TraversableOnce[Event[Trackloss]] = {
+
+    val sortedEvents = events.toList.sortBy(_.start)
+    val lastEvent: Event[Dispensation] = sortedEvents.last.copy(start = config.studyEnd)
+
+    val addMonthDelay = (event: Event[Dispensation]) => Trackloss(event.patientID, (event.start + config.tracklossMonthDelay).get)
+
+    (sortedEvents :+ lastEvent).toStream.sliding(2, 1).filter(isInSlide).map(slide => addMonthDelay(slide.head))
+  }
+
+  private def isInSlide(slide: Stream[Event[Dispensation]]): Boolean = {
+    val reachTS = (slide.head.start + config.emptyMonths).get
+    slide.last.start.after(reachTS) || slide.last.start.equals(reachTS)
+  }
+
+
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossesConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossesConfig.scala
new file mode 100644
index 00000000..9e1a7f32
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossesConfig.scala
@@ -0,0 +1,13 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.transformers.tracklosses
+
+import java.sql.Timestamp
+import me.danielpes.spark.datetime.Period
+import me.danielpes.spark.datetime.implicits._
+import fr.polytechnique.cmap.cnam.etl.config.CaseClassConfig
+
+case class TracklossesConfig(
+  studyEnd: Timestamp,
+  emptyMonths: Period = 4.months,
+  tracklossMonthDelay: Period = 2.months) extends CaseClassConfig
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfig.scala
index bc876f39..48470be2 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfig.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfig.scala
@@ -4,34 +4,35 @@ package fr.polytechnique.cmap.cnam.study.bulk

import java.time.LocalDate

import pureconfig.generic.auto._
+import fr.polytechnique.cmap.cnam.etl.config.BaseConfig
import fr.polytechnique.cmap.cnam.etl.config.study.StudyConfig
-import fr.polytechnique.cmap.cnam.etl.config.{BaseConfig, ConfigLoader}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.DrugConfig
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.Cip13Level
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.DrugConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.{Cip13Level, DrugClassificationLevel}

case class BulkConfig(
  input:
StudyConfig.InputPaths, - output: StudyConfig.OutputPaths) extends StudyConfig { - val drugs: DrugConfig = BulkConfig.DrugsConfig + output: StudyConfig.OutputPaths, + drugs: DrugConfig = BulkConfig.DrugsConfig()) extends StudyConfig { val base: BaseConfig = BulkConfig.BaseConfig } -object BulkConfig extends ConfigLoader { +object BulkConfig extends BulkConfigLoader { def load(path: String, env: String): BulkConfig = { val defaultPath = "config/bulk/default.conf" loadConfigWithDefaults[BulkConfig](path, defaultPath, env) } + final case class DrugsConfig( + override val level: DrugClassificationLevel = Cip13Level, + override val families: List[DrugClassConfig] = List.empty + ) extends DrugConfig(level = level, families = families) + final object BaseConfig extends BaseConfig( ageReferenceDate = LocalDate.of(2011, 1, 1), studyStart = LocalDate.of(2010, 1, 1), studyEnd = LocalDate.of(2015, 1, 1) ) - final object DrugsConfig extends DrugConfig( - level = Cip13Level, - families = List.empty - ) - } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfigLoader.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfigLoader.scala new file mode 100644 index 00000000..d194a5d4 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkConfigLoader.scala @@ -0,0 +1,16 @@ +package fr.polytechnique.cmap.cnam.study.bulk + +import pureconfig.ConfigReader +import fr.polytechnique.cmap.cnam.etl.config.ConfigLoader +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig +import fr.polytechnique.cmap.cnam.study.fall.config.FallDrugClassConfig + +class BulkConfigLoader extends ConfigLoader { + + //For reading DrugConfigClasses that are related to the Fall study + implicit val drugConfigReader: ConfigReader[DrugClassConfig] = ConfigReader[String].map( + family => + FallDrugClassConfig.familyFromString(family) + ) + +} diff --git 
a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkMain.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkMain.scala index 158966d2..ab929e44 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkMain.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/BulkMain.scala @@ -3,19 +3,12 @@ package fr.polytechnique.cmap.cnam.study.bulk import java.io.PrintWriter -import scala.collection.mutable import org.apache.spark.sql.{Dataset, SQLContext} import fr.polytechnique.cmap.cnam.Main -import fr.polytechnique.cmap.cnam.etl.extractors.acts.{DcirMedicalActExtractor, McoCcamActExtractor, McoCeActExtractor, McoCimMedicalActExtractor} -import fr.polytechnique.cmap.cnam.etl.extractors.classifications.GhmExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses._ -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.DrugExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays.McoHospitalStaysExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.patients.{Patients, PatientsConfig} import fr.polytechnique.cmap.cnam.etl.implicits import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.Path -import fr.polytechnique.cmap.cnam.util.reporting.{MainMetadata, OperationMetadata, OperationReporter, OperationTypes} +import fr.polytechnique.cmap.cnam.study.bulk.extractors._ +import fr.polytechnique.cmap.cnam.util.reporting.MainMetadata object BulkMain extends Main { override def appName: String = "BulkMain" @@ -24,7 +17,6 @@ object BulkMain extends Main { sqlContext: SQLContext, argsMap: Map[String, String]): Option[Dataset[_]] = { - val format = new java.text.SimpleDateFormat("yyyy_MM_dd_HH_mm_ss") val startTimestamp = new java.util.Date() val bulkConfig = BulkConfig.load(argsMap("conf"), argsMap("env")) @@ -32,187 +24,21 @@ object BulkMain extends Main { import implicits.SourceReader val sources = Sources.sanitize(sqlContext.readSources(bulkConfig.input)) - val 
operationsMetadata = mutable.Buffer[OperationMetadata]() - - val drugs = new DrugExtractor(bulkConfig.drugs).extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "DrugPurchases", - List("DCIR"), - OperationTypes.Dispensations, - drugs.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - drugs.unpersist() - - val hospitalStays = McoHospitalStaysExtractor.extract(sources, Set.empty).cache() - operationsMetadata += { - OperationReporter - .report( - "HospitalStays", - List("MCO"), - OperationTypes.HospitalStays, - hospitalStays.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - val dcirMedicalAct = DcirMedicalActExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "DCIRMedicalAct", - List("DCIR"), - OperationTypes.MedicalActs, - dcirMedicalAct.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - dcirMedicalAct.unpersist() - - - val cimMedicalAct = McoCimMedicalActExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "CIM-Medical-Acts", - List("MCO"), - OperationTypes.MedicalActs, - cimMedicalAct.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - cimMedicalAct.unpersist() - - - val ccamMedicalAct = McoCcamActExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "CCAM-Medical-Acts", - List("MCO"), - OperationTypes.MedicalActs, - ccamMedicalAct.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - ccamMedicalAct.unpersist() - - - val liberalActs = McoCeActExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "McoCEMedicalActs", - List("MCO_ACE"), - OperationTypes.MedicalActs, - liberalActs.toDF, - Path(bulkConfig.output.outputSavePath), - 
bulkConfig.output.saveMode - ) - } - - liberalActs.unpersist() - - val imbActs = ImbDiagnosisExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "ImbDiagnoses", - List("IR_IMB_R"), - OperationTypes.MedicalActs, - imbActs.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - imbActs.unpersist() - - val classification = GhmExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "GHM", - List("MCO"), - OperationTypes.AnyEvents, - classification.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - - classification.unpersist() - - val mainDiag = McoMainDiagnosisExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "MainDiagnosis", - List("MCO"), - OperationTypes.Diagnosis, - mainDiag.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - mainDiag.unpersist() - - val linkedDiag = McoLinkedDiagnosisExtractor.extract(sources, Set.empty).cache() - - operationsMetadata += { - OperationReporter.report( - "LinkedDiagnosis", - List("MCO"), - OperationTypes.Diagnosis, - linkedDiag.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - linkedDiag.unpersist() - - val associatedDiag = McoAssociatedDiagnosisExtractor.extract(sources, Set.empty).cache() - operationsMetadata += { - OperationReporter.report( - "AssociatedDiagnosis", - List("MCO"), - OperationTypes.Diagnosis, - associatedDiag.toDF, - Path(bulkConfig.output.outputSavePath), - bulkConfig.output.saveMode - ) - } - associatedDiag.unpersist() - - - val patients = new Patients(PatientsConfig(bulkConfig.base.studyStart)).extract(sources).cache() - operationsMetadata += { - OperationReporter.report( - "BasePopulation", - List("IR_BEN", "DCIR", "MCO", "MCO_CE"), - OperationTypes.Patients, - patients.toDF, - Path(bulkConfig.output.outputSavePath), - 
bulkConfig.output.saveMode - ) - } - patients.unpersist() - + val sourceExtractor: List[SourceExtractor] = List( + new DcirSourceExtractor(bulkConfig.output.root, bulkConfig.output.saveMode, bulkConfig.drugs), + new McoSourceExtractor(bulkConfig.output.root, bulkConfig.output.saveMode), + new McoCeSourceExtractor(bulkConfig.output.root, bulkConfig.output.saveMode), + new SsrSourceExtractor(bulkConfig.output.root, bulkConfig.output.saveMode), + new SsrCeSourceExtractor(bulkConfig.output.root, bulkConfig.output.saveMode), + new HadSourceExtractor(bulkConfig.output.root, bulkConfig.output.saveMode) + ) // Write Metadata - val metadata = MainMetadata(this.getClass.getName, startTimestamp, new java.util.Date(), operationsMetadata.toList) + val metadata = MainMetadata( + this.getClass.getName, startTimestamp, new java.util.Date(), + sourceExtractor.flatMap(se => se.extract(sources)) ++ + new PatientExtractor(bulkConfig.output.root, bulkConfig.output.saveMode, bulkConfig.base).extract(sources) + ) val metadataJson: String = metadata.toJsonString() new PrintWriter("metadata_bulk_" + format.format(startTimestamp) + ".json") { diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/DcirSourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/DcirSourceExtractor.scala new file mode 100644 index 00000000..b0dcae27 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/DcirSourceExtractor.scala @@ -0,0 +1,42 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events.{Drug, MedicalAct, NgapAct, PractitionerClaimSpeciality} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.{DcirBiologyActExtractor, DcirMedicalActExtractor} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor} +import 
fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts.{DcirNgapActExtractor, NgapActConfig, NgapWithNatClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.prestations.{MedicalPractitionerClaimExtractor, NonMedicalPractitionerClaimExtractor} + +class DcirSourceExtractor( + override val path: String, + override val saveMode: String, + val drugConfig: DrugConfig) extends SourceExtractor(path, saveMode) { + override val sourceName: String = "DCIR" + + override val extractors = List( + ExtractorSources[MedicalAct, SimpleExtractorCodes](DcirMedicalActExtractor(SimpleExtractorCodes.empty), List("ER_PRS_F", "ER_CAM_F", "ER_ETE_F"), "DCIR_MEDICAL_ACT"), + ExtractorSources[MedicalAct, SimpleExtractorCodes]( + DcirBiologyActExtractor(SimpleExtractorCodes.empty), + List("ER_PRS_F", "ER_BIO_F", "ER_ETE_F"), + "DCIR_BIOLOGICAL_ACT" + ), + ExtractorSources[Drug, DrugConfig](new DrugExtractor(drugConfig), List("ER_PRS_F", "IR_PHA_R"), "DRUG_PURCHASES"), + ExtractorSources[NgapAct, NgapActConfig[NgapWithNatClassConfig]]( + new DcirNgapActExtractor(NgapActConfig(List.empty)), + List("ER_PRS_F", "IR_NAT_V", "ER_ETE_F"), + "DCIR_NGAP_ACTS" + ), + ExtractorSources[PractitionerClaimSpeciality, SimpleExtractorCodes]( + MedicalPractitionerClaimExtractor(SimpleExtractorCodes.empty), + List("ER_PRS_F"), + "DCIR_MEDICAL_PRACTIONNER" + ), + ExtractorSources[PractitionerClaimSpeciality, SimpleExtractorCodes]( + NonMedicalPractitionerClaimExtractor(SimpleExtractorCodes.empty), + List("ER_PRS_F"), + "DCIR_NON_MEDICAL_PRACTIONNER" + ) + ) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/HadSourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/HadSourceExtractor.scala new file mode 100644 index 00000000..f4549823 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/HadSourceExtractor.scala @@ -0,0 +1,37 @@ +// License: BSD 3 clause + +package 
fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, HospitalStay, MedicalAct, MedicalTakeOverReason} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.HadCcamActExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.{HadAssociatedDiagnosisExtractor, HadMainDiagnosisExtractor} +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.HadHospitalStaysExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.events.takeoverreasons.{HadAssociatedTakeOverExtractor, HadMainTakeOverExtractor} + +class HadSourceExtractor(override val path: String, override val saveMode: String) extends SourceExtractor( + path, + saveMode +) { + override val sourceName: String = "HAD" + override val extractors = List( + ExtractorSources[MedicalAct, SimpleExtractorCodes](HadCcamActExtractor(SimpleExtractorCodes.empty), List("HAD_C", "HAD_A"), "HAD_CCAM_ACT"), + ExtractorSources[Diagnosis, SimpleExtractorCodes](HadMainDiagnosisExtractor(SimpleExtractorCodes.empty), List("HAD_C", "HAD_B"), "HAD_MAIN_DIAGNOSIS"), + ExtractorSources[Diagnosis, SimpleExtractorCodes](HadAssociatedDiagnosisExtractor(SimpleExtractorCodes.empty), List("HAD_C", "HAD_D"), "HAD_ASSOCIATED_DIAGNOSIS"), + ExtractorSources[MedicalTakeOverReason, SimpleExtractorCodes]( + HadMainTakeOverExtractor(SimpleExtractorCodes.empty), + List("HAD_C", "HAD_B"), + "HAD_MAIN_TAKE_OVER_REASON" + ), + ExtractorSources[MedicalTakeOverReason, SimpleExtractorCodes]( + HadAssociatedTakeOverExtractor(SimpleExtractorCodes.empty), + List("HAD_C", "HAD_B"), + "HAD_ASSOCIATED_TAKE_OVER_REASON" + ), + ExtractorSources[HospitalStay, SimpleExtractorCodes]( + HadHospitalStaysExtractor, + List("HAD_C", "HAD_B"), + "HAD_STAYS" + ) + ) +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/ImbSourceExtractor.scala 
b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/ImbSourceExtractor.scala new file mode 100644 index 00000000..1d1c0d22 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/ImbSourceExtractor.scala @@ -0,0 +1,17 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events.Diagnosis +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.ImbCimDiagnosisExtractor + +class ImbSourceExtractor(override val path: String, override val saveMode: String) extends SourceExtractor( + path, + saveMode +) { + override val sourceName: String = "IMB_R" + override val extractors = List( + ExtractorSources[Diagnosis, SimpleExtractorCodes](ImbCimDiagnosisExtractor(SimpleExtractorCodes.empty), List("IR_IMB_R"), "ALD") + ) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoCeSourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoCeSourceExtractor.scala new file mode 100644 index 00000000..a79086ea --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoCeSourceExtractor.scala @@ -0,0 +1,49 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.McoCeCcamActExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.McoceEmergenciesExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts.{McoCeFbstcNgapActExtractor, McoCeFcstcNgapActExtractor, NgapActClassConfig, NgapActConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.prestations.{McoCeFbstcSpecialtyExtractor, McoCeFcstcSpecialtyExtractor} + +class 
McoCeSourceExtractor(override val path: String, override val saveMode: String) extends SourceExtractor( + path, + saveMode +) { + override val sourceName: String = "MCO_CE" + override val extractors = List( + ExtractorSources[MedicalAct, SimpleExtractorCodes]( + McoCeCcamActExtractor(SimpleExtractorCodes.empty), + List("MCO_CSTC", "MCO_FMSTC"), + "MCO_CE_CCAM_ACTS" + ), + ExtractorSources[NgapAct, NgapActConfig[NgapActClassConfig]]( + new McoCeFbstcNgapActExtractor(NgapActConfig(List.empty)), + List("MCO_CSTC", "MCO_FBSTC"), + "MCO_CE_FBSTC_NGAP_ACTS" + ), + ExtractorSources[NgapAct, NgapActConfig[NgapActClassConfig]]( + new McoCeFcstcNgapActExtractor(NgapActConfig(List.empty)), + List("MCO_CSTC", "MCO_FCSTC"), + "MCO_CE_FCSTC_NGAP_ACTS" + ), + ExtractorSources[PractitionerClaimSpeciality, SimpleExtractorCodes]( + McoCeFbstcSpecialtyExtractor(SimpleExtractorCodes.empty), + List("MCO_CSTC", "MCO_FBSTC"), + "MCO_CE_FBSTC_PRACTITIONER_SPECIALITY" + ), + ExtractorSources[PractitionerClaimSpeciality, SimpleExtractorCodes]( + McoCeFcstcSpecialtyExtractor(SimpleExtractorCodes.empty), + List("MCO_CSTC", "MCO_FCSTC"), + "MCO_CE_FCSTC_PRACTITIONER_SPECIALITY" + ), + ExtractorSources[HospitalStay, SimpleExtractorCodes]( + McoceEmergenciesExtractor, + List("MCO_CSTC", "MCO_FBSTC"), + "MCO_CE_EMERGENCY_VISIT" + ) + ) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoSourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoSourceExtractor.scala new file mode 100644 index 00000000..1694f927 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoSourceExtractor.scala @@ -0,0 +1,44 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.McoCcamActExtractor +import 
fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.{McoAssociatedDiagnosisExtractor, McoLinkedDiagnosisExtractor, McoMainDiagnosisExtractor} +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.McoHospitalStaysExtractor + +class McoSourceExtractor(override val path: String, override val saveMode: String) extends SourceExtractor( + path, + saveMode +) { + override val sourceName: String = "MCO" + override val extractors = List( + ExtractorSources[MedicalAct, SimpleExtractorCodes]( + McoCcamActExtractor(SimpleExtractorCodes.empty), + List("MCO_C", "MCO_A"), + "MCO_CCAM_ACT" + ), + ExtractorSources[Diagnosis, SimpleExtractorCodes]( + McoMainDiagnosisExtractor(SimpleExtractorCodes.empty), + List("MCO_C", "MCO_B"), + "MCO_MAIN_DIAGNOSIS" + ), + ExtractorSources[Diagnosis, SimpleExtractorCodes]( + McoAssociatedDiagnosisExtractor(SimpleExtractorCodes.empty), + List("MCO_C", "MCO_B", "MCO_D"), + "MCO_ASSOCIATED_DIAGNOSIS" + ), + ExtractorSources[Diagnosis, SimpleExtractorCodes]( + McoLinkedDiagnosisExtractor(SimpleExtractorCodes.empty), + List("MCO_C", "MCO_B"), + "MCO_LINKED_DIAGNOSIS" + ), + ExtractorSources[HospitalStay, SimpleExtractorCodes]( + McoHospitalStaysExtractor, + List("MCO_C", "MCO_B"), + "MCO_HOSPITAL_STAY" + ) + ) + +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/PatientExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/PatientExtractor.scala new file mode 100644 index 00000000..38d452a8 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/PatientExtractor.scala @@ -0,0 +1,28 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.config.BaseConfig +import fr.polytechnique.cmap.cnam.etl.extractors.patients.{AllPatientExtractor, PatientsConfig} +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import 
fr.polytechnique.cmap.cnam.etl.transformers.patients.PatientFilters +import fr.polytechnique.cmap.cnam.util.Path +import fr.polytechnique.cmap.cnam.util.reporting.{OperationMetadata, OperationReporter, OperationTypes} + +class PatientExtractor(val path: String, val saveMode: String, val baseConfig: BaseConfig) { + def extract(sources: Sources): List[OperationMetadata] = { + val patients = new PatientFilters(PatientsConfig(baseConfig.studyStart)).filterPatients(AllPatientExtractor.extract(sources)) + List( + OperationReporter + .report( + "all_patients", + List("DCIR", "MCO", "IR_BEN_R", "MCO_CE"), + OperationTypes.Patients, + patients.toDF, + Path(path), + saveMode + ) + ) + + } +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SourceExtractor.scala new file mode 100644 index 00000000..2c5beaa9 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SourceExtractor.scala @@ -0,0 +1,82 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import scala.reflect.runtime.universe._ +import scala.util.{Failure, Success, Try} +import org.apache.log4j.Logger +import org.apache.spark.sql.Dataset +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.Extractor +import fr.polytechnique.cmap.cnam.etl.extractors.codes.ExtractorCodes +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.Path +import fr.polytechnique.cmap.cnam.util.reporting.{OperationMetadata, OperationReporter, OperationTypes} + +/** + * Extracts all available Events from the given source. + * + * This regroups all the available Extractors for a given source and executes them on the Source. If the run succeeds, + then the result is stored in the given path and saved in the OperationMetadata. If the run fails, the + logger warns the user, indicating the missing tables that must be flattened. + * + * Every implementation of this abstract class must be updated whenever a new Extractor that works on the given Source is + * added. + */ +abstract class SourceExtractor(val path: String, val saveMode: String) { + val sourceName: String + + // This is the ugliest bit of this implementation, and there is no way of getting around it because of Spark. + // First, Spark Dataset is invariant, hence there is no way of making the Extractor trait covariant to avoid this ugly upper + // bounding. + // Second, TypeTag is needed for the Spark encoder for case classes, hence the explicit typing instead of AnyEvent. + // @TODO: Every time you add a new Event type you will need to add it in the "with" clause + val extractors: List[ExtractorSources[_ >: MedicalAct with HospitalStay with Diagnosis with Drug + with MedicalTakeOverReason with NgapAct with PractitionerClaimSpeciality <: AnyEvent with EventBuilder, + ExtractorCodes]] + private val logger = Logger.getLogger(this.getClass) + + /** + * Extracts all Events from the Source and returns a List of OperationMetadata. + * + * @param sources Sources object containing the sources. + * @return OperationMetadata containing all Events extracted.
+ */ + def extract(sources: Sources): List[OperationMetadata] = extractors.flatMap(es => runAndReport(sources)(es)) + + def runAndReport[A <: AnyEvent : TypeTag](sources: Sources)(es: ExtractorSources[A, ExtractorCodes]): Option[OperationMetadata] = + run(es.extractor, sources) match { + case Success(tde) => Some(report(es, tde)) + case Failure(error) => { + logger.warn( + "Extractor " + es + .extractor + " failed; you probably didn't flatten all of the following tables: " + es.sources + ) + None + } + } + + def run[A <: AnyEvent : TypeTag](extractor: Extractor[A, ExtractorCodes], sources: Sources): Try[Dataset[Event[A]]] = { + Try { + extractor.extract(sources)(typeTag[A]) + } + } + + def report[A <: AnyEvent : TypeTag]( + extractorSources: ExtractorSources[A, ExtractorCodes], + result: Dataset[Event[A]]): OperationMetadata = OperationReporter + .report( + extractorSources.name, + extractorSources.sources, + OperationTypes.AnyEvents, + result.toDF, + Path(path), + saveMode + ) +} + + +case class ExtractorSources[EventType <: AnyEvent : TypeTag, +Codes <: ExtractorCodes]( + extractor: Extractor[EventType, Codes], + sources: List[String], + name: String) \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrCeSourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrCeSourceExtractor.scala new file mode 100644 index 00000000..91bd1f50 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrCeSourceExtractor.scala @@ -0,0 +1,21 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events.MedicalAct +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.SsrCeActExtractor + +class SsrCeSourceExtractor(override val path: String, override val saveMode: String) extends SourceExtractor( + path, + saveMode
+) { + override val sourceName: String = "SSR_CE" + override val extractors = List( + ExtractorSources[MedicalAct, SimpleExtractorCodes]( + SsrCeActExtractor(SimpleExtractorCodes.empty), + List("SSR_CSTC", "SSR_FMSTC"), + "SSR_CE_CCAM" + ) + ) +} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrSourceExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrSourceExtractor.scala new file mode 100644 index 00000000..d3d2b613 --- /dev/null +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrSourceExtractor.scala @@ -0,0 +1,49 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, HospitalStay, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.{SsrCcamActExtractor, SsrCsarrActExtractor} +import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses._ +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.SsrHospitalStaysExtractor + +class SsrSourceExtractor(override val path: String, override val saveMode: String) extends SourceExtractor( + path, + saveMode +) { + override val sourceName: String = "SSR" + override val extractors = List( + ExtractorSources[MedicalAct, SimpleExtractorCodes]( + SsrCcamActExtractor(SimpleExtractorCodes.empty), + List("SSR_C", "SSR_CCAM"), + "SSR_CCAM" + ), + ExtractorSources[MedicalAct, SimpleExtractorCodes]( + SsrCsarrActExtractor(SimpleExtractorCodes.empty), + List("SSR_C", "SSR_CSARR"), + "SSR_CSARR" + ), + ExtractorSources[Diagnosis, SimpleExtractorCodes]( + SsrMainDiagnosisExtractor(SimpleExtractorCodes.empty), + List("SSR_C", "SSR_B"), + "SSR_MAIN_DIAGNOSIS" + ), + ExtractorSources[Diagnosis, SimpleExtractorCodes]( + SsrLinkedDiagnosisExtractor(SimpleExtractorCodes.empty), + List("SSR_C", "SSR_B"), + "SSR_LINKED_DIAGNOSIS" + ), + 
ExtractorSources[Diagnosis, SimpleExtractorCodes]( + SsrAssociatedDiagnosisExtractor(SimpleExtractorCodes.empty), + List("SSR_C", "SSR_D"), + "SSR_ASSOCIATED_DIAGNOSIS" + ), + ExtractorSources[Diagnosis, SimpleExtractorCodes]( + SsrTakingOverPurposeExtractor(SimpleExtractorCodes.empty), + List("SSR_C", "SSR_B"), + "SSR_TAKE_OVER_REASON" + ), + ExtractorSources[HospitalStay, SimpleExtractorCodes](SsrHospitalStaysExtractor, List("SSR_C", "SSR_B"), "SSR_STAY") + ) +} \ No newline at end of file diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/dreesChronic/extractors/PractitionnerClaimSpecialityExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/dreesChronic/extractors/PractitionnerClaimSpecialityExtractor.scala deleted file mode 100644 index 7e0aeac3..00000000 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/dreesChronic/extractors/PractitionnerClaimSpecialityExtractor.scala +++ /dev/null @@ -1,20 +0,0 @@ -package fr.polytechnique.cmap.cnam.study.dreesChronic.extractors - - -import fr.polytechnique.cmap.cnam.etl.events.{PractitionerClaimSpeciality, Event} -import fr.polytechnique.cmap.cnam.etl.extractors.prestations._ -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions.unionDatasets -import org.apache.spark.sql.Dataset - -class PractitionnerClaimSpecialityExtractor(config: PractitionerClaimSpecialityConfig) { - - def extract(sources: Sources): Dataset[Event[PractitionerClaimSpeciality]] = { - - val nonMedicalSpeciality = NonMedicalPractitionerClaimExtractor.extract(sources, config.nonMedicalSpeCodes.toSet) - val medicalSpeciality = MedicalPractitionerClaimExtractor.extract(sources, config.medicalSpeCodes.toSet) - - unionDatasets(nonMedicalSpeciality, medicalSpeciality) - - } -} diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMain.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMain.scala index 33faad92..957b310d 100644 --- 
a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMain.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMain.scala @@ -5,15 +5,18 @@ package fr.polytechnique.cmap.cnam.study.fall import scala.collection.mutable import org.apache.spark.sql.{Dataset, SQLContext} import fr.polytechnique.cmap.cnam.Main -import fr.polytechnique.cmap.cnam.etl.events.{Event, FollowUp, Outcome} -import fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays.McoHospitalStaysExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.patients.{Patients, PatientsConfig} +import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event, FollowUp, Outcome} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.McoHospitalStaysExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.patients.{AllPatientExtractor, PatientsConfig} import fr.polytechnique.cmap.cnam.etl.filters.PatientFilters import fr.polytechnique.cmap.cnam.etl.implicits import fr.polytechnique.cmap.cnam.etl.patients.Patient import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.etl.transformers.drugprescription.DrugPrescriptionTransformer import fr.polytechnique.cmap.cnam.etl.transformers.exposures.ExposureTransformer import fr.polytechnique.cmap.cnam.etl.transformers.interaction.NLevelInteractionTransformer +import fr.polytechnique.cmap.cnam.etl.transformers.patients.PatientFilters import fr.polytechnique.cmap.cnam.study.fall.codes._ import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig import fr.polytechnique.cmap.cnam.study.fall.extractors._ @@ -40,9 +43,7 @@ object FallMain extends Main with FractureCodes { val dcir = sources.dcir.get.repartition(4000).persist() val mco = sources.mco.get.repartition(4000).persist() - val operationsMetadata = computeControls(sources, fallConfig) ++ - computeExposures(sources, fallConfig) ++ - computeOutcomes(sources, fallConfig) + val 
operationsMetadata = computeOutcomes(sources, fallConfig) dcir.unpersist() mco.unpersist() @@ -60,7 +61,7 @@ object FallMain extends Main with FractureCodes { def computeHospitalStays(sources: Sources, fallConfig: FallConfig): mutable.Buffer[OperationMetadata] = { val operationsMetadata = mutable.Buffer[OperationMetadata]() if (fallConfig.runParameters.hospitalStays) { - val hospitalStays = McoHospitalStaysExtractor.extract(sources, Set.empty).cache() + val hospitalStays = McoHospitalStaysExtractor.extract(sources).cache() operationsMetadata += { OperationReporter @@ -99,27 +100,40 @@ object FallMain extends Main with FractureCodes { None } - val optionPatients = if (fallConfig.runParameters.patients) { - val patients = new Patients(PatientsConfig(fallConfig.base.studyStart)).extract(sources).cache() + val optionAllPatients = if (fallConfig.runParameters.patients) { + val allpatients = AllPatientExtractor.extract(sources).cache() operationsMetadata += { OperationReporter .report( - "extract_patients", + "extract_raw_patients", List("DCIR", "MCO", "IR_BEN_R", "MCO_CE"), OperationTypes.Patients, - patients.toDF, + allpatients.toDF, Path(fallConfig.output.outputSavePath), fallConfig.output.saveMode ) } - Some(patients) + Some(allpatients) } else { None } + val filteredpatientsconfig = new PatientFilters(PatientsConfig(fallConfig.base.studyStart)).filterPatients(optionAllPatients.get).cache() + operationsMetadata += { + OperationReporter + .report( + "extract_filtered_patients", + List("DCIR", "MCO", "IR_BEN_R", "MCO_CE"), + OperationTypes.Patients, + filteredpatientsconfig.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + if (fallConfig.runParameters.startGapPatients) { import PatientFilters._ - val filteredPatients: Dataset[Patient] = optionPatients.get + val filteredPatients: Dataset[Patient] = filteredpatientsconfig .filterNoStartGap(optionDrugPurchases.get, fallConfig.base.studyStart, fallConfig.patients.startGapInMonths) 
operationsMetadata += { OperationReporter @@ -139,7 +153,7 @@ object FallMain extends Main with FractureCodes { val definition = fallConfig.exposures val patientsWithFollowUp: Dataset[(Patient, Event[FollowUp])] = FallStudyFollowUps .transform( - optionPatients.get, + filteredpatientsconfig, fallConfig.base.studyStart, fallConfig.base.studyEnd, fallConfig.patients.followupStartDelay @@ -184,6 +198,34 @@ object FallMain extends Main with FractureCodes { ) } + val prescriptions = new DrugPrescriptionTransformer().transform(optionDrugPurchases.get).cache() + + operationsMetadata += { + OperationReporter + .report( + "prescriptions", + List("drug_purchases"), + OperationTypes.Dispensations, + prescriptions.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + + val prescriptionsExposures = new ExposureTransformer(definition) + .transform(patientsWithFollowUp.map(_._2).distinct())(prescriptions.as[Event[Drug]]).cache() + operationsMetadata += { + OperationReporter + .report( + "prescriptions_exposures", + List("prescriptions", "follow_up"), + OperationTypes.Exposures, + prescriptionsExposures.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + new ExposureTransformer(definition) .transform(patientsWithFollowUp.map(_._2).distinct())(optionDrugPurchases.get) } @@ -212,7 +254,6 @@ object FallMain extends Main with FractureCodes { ) } } - operationsMetadata } @@ -221,28 +262,28 @@ object FallMain extends Main with FractureCodes { val operationsMetadata = mutable.Buffer[OperationMetadata]() val optionDiagnoses = if (fallConfig.runParameters.diagnoses) { - logger.info("diagnoses") val diagnoses = new DiagnosisExtractor(fallConfig.diagnoses).extract(sources).persist() val diagnosesPopulation = DiagnosisCounter.process(diagnoses) operationsMetadata += { OperationReporter.reportDataAndPopulationAsDataSet( - "diagnoses", - List("MCO", "IR_IMB_R"), - OperationTypes.Diagnosis, - diagnoses, - diagnosesPopulation, - 
Path(fallConfig.output.outputSavePath), - fallConfig.output.saveMode - ) + "diagnoses", + List("MCO", "IR_IMB_R"), + OperationTypes.Diagnosis, + diagnoses, + diagnosesPopulation, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) } Some(diagnoses) } else { None } - val (optionActs, optionLiberalActs) = if (fallConfig.runParameters.acts) { - logger.info("Medical Acts") - val acts = new ActsExtractor(fallConfig.medicalActs).extract(sources).persist() + val (optionActs, optionLiberalActs, optionSurgeries) = if (fallConfig.runParameters.acts) { + val (acts, surgeries) = new ActsExtractor(fallConfig.medicalActs).extract(sources) + acts.cache() + surgeries.cache() operationsMetadata += { OperationReporter .report( @@ -254,7 +295,17 @@ object FallMain extends Main with FractureCodes { fallConfig.output.saveMode ) } - logger.info("Liberal Medical Acts") + operationsMetadata += { + OperationReporter + .report( + "fracture_surgeries", + List("MCO"), + OperationTypes.MedicalActs, + surgeries.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } val liberalActs = LiberalActsTransformer.transform(acts).persist() operationsMetadata += { OperationReporter @@ -267,20 +318,36 @@ object FallMain extends Main with FractureCodes { fallConfig.output.saveMode ) } - (Some(acts), Some(liberalActs)) + (Some(acts), Some(liberalActs), Some(surgeries)) } else { - (None, None) + (None, None, None) + } + val optionHospitalDeaths = if (fallConfig.runParameters.hospitalDeaths) { + val hospitalDeaths = new FallHospitalStayExtractor(SimpleExtractorCodes(List(Death.value))).extract(sources) + operationsMetadata += { + OperationReporter + .report( + "hospital_deaths", + List("MCO"), + OperationTypes.HospitalStays, + hospitalDeaths.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + Some(hospitalDeaths) + } else { + None } if (fallConfig.runParameters.outcomes) { - logger.info("Fractures") val fractures: 
Dataset[Event[Outcome]] = new FracturesTransformer(fallConfig) - .transform(optionLiberalActs.get, optionActs.get, optionDiagnoses.get) + .transform(optionLiberalActs.get, optionActs.get, optionDiagnoses.get, optionSurgeries.get, optionHospitalDeaths.get) operationsMetadata += { OperationReporter .report( "fractures", - List("acts"), + List("acts", "diagnoses"), OperationTypes.Outcomes, fractures.toDF, Path(fallConfig.output.outputSavePath), @@ -288,7 +355,6 @@ object FallMain extends Main with FractureCodes { ) } } - operationsMetadata } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtract.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtract.scala index 1627e5fa..762e7a52 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtract.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtract.scala @@ -6,11 +6,13 @@ import scala.collection.mutable import org.apache.spark.sql.{Dataset, SQLContext} import fr.polytechnique.cmap.cnam.Main import fr.polytechnique.cmap.cnam.etl.events.DcirAct -import fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays.McoHospitalStaysExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.patients.{Patients, PatientsConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.McoHospitalStaysExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.patients.{AllPatientExtractor, PatientsConfig} import fr.polytechnique.cmap.cnam.etl.implicits import fr.polytechnique.cmap.cnam.etl.patients.Patient import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.etl.transformers.patients.PatientFilters import fr.polytechnique.cmap.cnam.study.fall.codes._ import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig import fr.polytechnique.cmap.cnam.study.fall.extractors._ @@ -43,7 +45,7 @@ object FallMainExtract extends 
Main with FractureCodes { mutable.HashMap[String, OperationMetadata] = { if (fallConfig.runParameters.hospitalStays) { - val hospitalStays = McoHospitalStaysExtractor.extract(sources, Set.empty).cache() + val hospitalStays = McoHospitalStaysExtractor.extract(sources).cache() meta += { "extract_hospital_stays" -> OperationReporter @@ -92,21 +94,41 @@ object FallMainExtract extends Main with FractureCodes { } } - if (fallConfig.runParameters.patients) { - val patients: Dataset[Patient] = new Patients(PatientsConfig(fallConfig.base.studyStart)).extract(sources).cache() + val optionAllPatients = if (fallConfig.runParameters.patients) { + val allpatients: Dataset[Patient] = AllPatientExtractor.extract(sources).cache() meta += { - "extract_patients" -> + "extract_raw_patients" -> OperationReporter .reportAsDataSet( - "extract_patients", + "raw_patients", List("DCIR", "MCO", "IR_BEN_R", "MCO_CE"), OperationTypes.Patients, - patients, + allpatients, Path(fallConfig.output.outputSavePath), fallConfig.output.saveMode ) } + + Some(allpatients) + } else { + None } + + if (fallConfig.runParameters.patients) { + val filteredpatients: Dataset[Patient] = new PatientFilters(PatientsConfig(fallConfig.base.studyStart)).filterPatients(optionAllPatients.get).cache() + meta += { + "extract_filtered_patients" -> + OperationReporter + .reportAsDataSet( + "filtered_patients", + List("DCIR", "MCO", "IR_BEN_R", "MCO_CE"), + OperationTypes.Patients, + filteredpatients, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + } meta } @@ -114,10 +136,9 @@ object FallMainExtract extends Main with FractureCodes { mutable.HashMap[String, OperationMetadata] = { if (fallConfig.runParameters.diagnoses) { - logger.info("diagnoses") val diagnoses = new DiagnosisExtractor(fallConfig.diagnoses).extract(sources).persist() val diagnosesPopulation = DiagnosisCounter.process(diagnoses) - val diagnoses_report = OperationReporter.reportDataAndPopulationAsDataSet( + val diagnosesReport 
= OperationReporter.reportDataAndPopulationAsDataSet( "diagnoses", List("MCO", "IR_IMB_R"), OperationTypes.Diagnosis, @@ -127,14 +148,15 @@ object FallMainExtract extends Main with FractureCodes { fallConfig.output.saveMode ) meta += { - diagnoses_report.name -> diagnoses_report + diagnosesReport.name -> diagnosesReport } } if (fallConfig.runParameters.acts) { - logger.info("Medical Acts") - val acts = new ActsExtractor(fallConfig.medicalActs).extract(sources).persist() - val acts_report = OperationReporter.reportAsDataSet( + val (acts, surgeries) = new ActsExtractor(fallConfig.medicalActs).extract(sources) + acts.persist() + surgeries.persist() + val actsReport = OperationReporter.reportAsDataSet( "acts", List("DCIR", "MCO", "MCO_CE"), OperationTypes.MedicalActs, @@ -143,9 +165,21 @@ object FallMainExtract extends Main with FractureCodes { fallConfig.output.saveMode ) meta += { - acts_report.name -> acts_report + actsReport.name -> actsReport } - logger.info("Liberal Medical Acts") + + val surgeriesReport = OperationReporter.reportAsDataSet( + "surgeries", + List("MCO"), + OperationTypes.MedicalActs, + surgeries, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + meta += { + surgeriesReport.name -> surgeriesReport + } + val liberalActs = acts .filter(act => act.groupID == DcirAct.groupID.Liberal && !CCAMExceptions.contains(act.value)).persist() val liberal_acts_report = OperationReporter.reportAsDataSet( @@ -159,6 +193,19 @@ object FallMainExtract extends Main with FractureCodes { meta += { liberal_acts_report.name -> liberal_acts_report } + + val hospitalDeaths = new FallHospitalStayExtractor(SimpleExtractorCodes(List(Death.value))).extract(sources) + val hospitalDeathsReport = OperationReporter.reportAsDataSet( + "hospital_deaths", + List("MCO"), + OperationTypes.HospitalStays, + hospitalDeaths, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + meta += { + hospitalDeathsReport.name -> hospitalDeathsReport + } 
} meta } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainTransform.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainTransform.scala index 6e69ce6e..40a04c03 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainTransform.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainTransform.scala @@ -8,6 +8,7 @@ import fr.polytechnique.cmap.cnam.Main import fr.polytechnique.cmap.cnam.etl.events._ import fr.polytechnique.cmap.cnam.etl.filters.PatientFilters import fr.polytechnique.cmap.cnam.etl.patients.Patient +import fr.polytechnique.cmap.cnam.etl.transformers.drugprescription.DrugPrescriptionTransformer import fr.polytechnique.cmap.cnam.etl.transformers.exposures.ExposureTransformer import fr.polytechnique.cmap.cnam.etl.transformers.interaction.NLevelInteractionTransformer import fr.polytechnique.cmap.cnam.study.fall.codes._ @@ -49,7 +50,7 @@ object FallMainTransform extends Main with FractureCodes { val spark = SparkSession.builder.getOrCreate() import spark.implicits._ val patients: Dataset[Patient] - = spark.read.parquet(meta.get("extract_patients").get.outputPath) + = spark.read.parquet(meta.get("extract_filtered_patients").get.outputPath) .as[Patient].cache() val drugPurchases: Dataset[Event[Drug]] = spark.read.parquet(meta.get("drug_purchases").get.outputPath) @@ -111,6 +112,37 @@ object FallMainTransform extends Main with FractureCodes { fallConfig.output.saveMode ) } + + val prescriptions = new DrugPrescriptionTransformer().transform(drugPurchases).cache() + + meta += { + "prescriptions" -> + OperationReporter + .report( + "prescriptions", + List("drug_purchases"), + OperationTypes.Dispensations, + prescriptions.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + + val prescriptionsExposures = new ExposureTransformer(definition) + .transform(patientsWithFollowUp.map(_._2).distinct())(prescriptions.as[Event[Drug]]).cache() + meta += { + 
"prescriptions_exposures" -> + OperationReporter + .report( + "prescriptions_exposures", + List("prescriptions", "follow_up"), + OperationTypes.Exposures, + prescriptionsExposures.toDF, + Path(fallConfig.output.outputSavePath), + fallConfig.output.saveMode + ) + } + new ExposureTransformer(definition).transform(patientsWithFollowUp.map(_._2))(drugPurchases).cache() } val exposuresReport = OperationReporter.reportAsDataSet( @@ -158,10 +190,15 @@ object FallMainTransform extends Main with FractureCodes { val liberalActs = spark.read.parquet(meta("liberal_acts").outputPath) .as[Event[MedicalAct]] + val surgeries = spark.read.parquet(meta("surgeries").outputPath) + .as[Event[MedicalAct]] + + val hospitalDeaths = spark.read.parquet(meta("hospital_deaths").outputPath) + .as[Event[HospitalStay]] + if (fallConfig.runParameters.outcomes) { - logger.info("Fractures") val fractures: Dataset[Event[Outcome]] = new FracturesTransformer(fallConfig) - .transform(liberalActs, acts, diagnoses) + .transform(liberalActs, acts, diagnoses, surgeries, hospitalDeaths) val fractures_report = OperationReporter.reportAsDataSet( "fractures", List("acts"), diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfig.scala index a5124dc0..ec97c355 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfig.scala @@ -8,12 +8,12 @@ import me.danielpes.spark.datetime.implicits._ import pureconfig.generic.auto._ import fr.polytechnique.cmap.cnam.etl.config.BaseConfig import fr.polytechnique.cmap.cnam.etl.config.study.StudyConfig -import fr.polytechnique.cmap.cnam.etl.extractors.acts.MedicalActsConfig -import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses.DiagnosesConfig -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.DrugConfig -import 
fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification._
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.{Antidepresseurs, Antihypertenseurs, Hypnotiques, Neuroleptiques}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.{DrugClassificationLevel, TherapeuticLevel}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.MedicalActsConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.DiagnosesConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.DrugConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification._
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.{Antidepresseurs, Antihypertenseurs, Hypnotiques, Neuroleptiques}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.{DrugClassificationLevel, TherapeuticLevel}
 import fr.polytechnique.cmap.cnam.etl.transformers.exposures._
 import fr.polytechnique.cmap.cnam.etl.transformers.interaction.InteractionTransformerConfig
 import fr.polytechnique.cmap.cnam.study.fall.codes._
@@ -89,8 +89,9 @@ object FallConfig extends FallConfigLoader with FractureCodes {

   /** Parameters needed for the Interaction Transformer **/
   case class InteractionConfig(
-    override val level: Int = 2
-  ) extends InteractionTransformerConfig(level = level)
+    override val level: Int = 2,
+    override val minimumDuration: Period = 30.days
+  ) extends InteractionTransformerConfig(level = level, minimumDuration)

   /** Parameters needed for the diagnosesConfig **/
   case class SitesConfig(sites: List[BodySite] = List(BodySites)) {
@@ -99,7 +100,7 @@ object FallConfig extends FallConfigLoader with FractureCodes {

   /** Parameters if run the calculation of outcome or exposure **/
   case class RunConfig(
-    outcome: List[String] = List("Acts", "Diagnoses", "Outcomes"),
+    outcome: List[String] = List("Acts", "Diagnoses", "HospitalDeaths", "Outcomes"),
     exposure: List[String] = List("Patients",
       "StartGapPatients", "DrugPurchases", "Exposures"),
     hospitalStay: List[String] = List("HospitalStay")) {
     //exposures
@@ -110,7 +111,8 @@ object FallConfig extends FallConfigLoader with FractureCodes {
     //outcomes
     val diagnoses: Boolean = outcome contains "Diagnoses"
     val acts: Boolean = outcome contains "Acts"
-    val outcomes: Boolean = List("Diagnoses", "Acts", "Outcomes").forall(outcome.contains)
+    val hospitalDeaths: Boolean = outcome contains "HospitalDeaths"
+    val outcomes: Boolean = List("Diagnoses", "Acts", "HospitalDeaths", "Outcomes").forall(outcome.contains)
     // Hospital Stays
     val hospitalStays: Boolean = hospitalStay contains "HospitalStay"
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigLoader.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigLoader.scala
index 514bb6ec..98edb02a 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigLoader.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigLoader.scala
@@ -4,7 +4,7 @@ package fr.polytechnique.cmap.cnam.study.fall.config

 import pureconfig.ConfigReader
 import fr.polytechnique.cmap.cnam.etl.config.ConfigLoader
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.DrugClassConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.DrugClassConfig
 import fr.polytechnique.cmap.cnam.study.fall.fractures.BodySite

 class FallConfigLoader extends ConfigLoader {
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallDrugClassConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallDrugClassConfig.scala
index 9c0211cf..cfae338b 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallDrugClassConfig.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallDrugClassConfig.scala
@@ -2,8 +2,8 @@

 package fr.polytechnique.cmap.cnam.study.fall.config

-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification._
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.{Antidepresseurs, Antihypertenseurs, Hypnotiques, Neuroleptiques}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification._
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.{Antidepresseurs, Antihypertenseurs, Hypnotiques, Neuroleptiques}

 object FallDrugClassConfig {
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ActsExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ActsExtractor.scala
index 758a45be..1f0041dc 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ActsExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ActsExtractor.scala
@@ -4,19 +4,29 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{DcirAct, Event, MedicalAct}
-import fr.polytechnique.cmap.cnam.etl.extractors.acts._
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.extractors.events.acts._
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
+import fr.polytechnique.cmap.cnam.study.fall.fractures.Surgery
 import fr.polytechnique.cmap.cnam.util.functions.unionDatasets

-class ActsExtractor(config: MedicalActsConfig) {
-  def extract(sources: Sources): Dataset[Event[MedicalAct]] = {
-    val dcirMedicalAct = DcirMedicalActExtractor.extract(sources, config.dcirCodes.toSet)
+class ActsExtractor(config: MedicalActsConfig) extends Serializable {
+  def extract(sources: Sources): (Dataset[Event[MedicalAct]], Dataset[Event[MedicalAct]]) = {
+    val dcirMedicalAct = DcirMedicalActExtractor(SimpleExtractorCodes(config.dcirCodes)).extract(sources)
       .filter(act => act.groupID != DcirAct.groupID.Unknown) // filter out unknown source acts
       .filter(act => act.groupID != DcirAct.groupID.PublicAmbulatory) //filter out public amb
-    val mcoCEMedicalActs = McoCeActExtractor.extract(sources, config.mcoCECodes.toSet)
-    val mcoMedicalActs = McoCcamActExtractor.extract(sources, config.mcoCCAMCodes.toSet)
+    val mcoCEMedicalActs = McoCeCcamActExtractor(SimpleExtractorCodes(config.mcoCECodes)).extract(sources)

-    unionDatasets(dcirMedicalAct, mcoCEMedicalActs, mcoMedicalActs)
+    val surgeryCodes = Surgery.surgeryCodes
+    val ccamCodes = config.mcoCCAMCodes
+    val allMcoActs = McoCcamActExtractor(SimpleExtractorCodes(ccamCodes ++ surgeryCodes))
+      .extract(sources)
+      .cache()
+
+    val fractureSurgeries = allMcoActs.filter(md => surgeryCodes.exists(code => code.startsWith(md.value)))
+    val mcoMedicalActs = allMcoActs.filter(md => ccamCodes.exists(code => code.startsWith(md.value)))
+
+    (unionDatasets(dcirMedicalAct, mcoCEMedicalActs, mcoMedicalActs), fractureSurgeries)
   }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/CardiacExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/CardiacExtractor.scala
index b104cb86..f67581bf 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/CardiacExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/CardiacExtractor.scala
@@ -4,13 +4,13 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.{DrugConfig, DrugExtractor}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.Cardiac
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.Cardiac
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources

 object CardiacExtractor {
   def extract(sources: Sources): Dataset[Event[Drug]] = {
-    new DrugExtractor(DrugConfig(TherapeuticLevel, List(Cardiac))).extract(sources, Set.empty)
+    new DrugExtractor(DrugConfig(TherapeuticLevel, List(Cardiac))).extract(sources)
   }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ControlDrugs.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ControlDrugs.scala
index 346a6a05..337a0ed1 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ControlDrugs.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/ControlDrugs.scala
@@ -4,9 +4,9 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families._
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.TherapeuticLevel
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.{DrugConfig, DrugExtractor}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families._
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources

 object ControlDrugs {
@@ -17,6 +17,6 @@ object ControlDrugs {
         List(Antihypertenseurs, Opioids, Cardiac, ProtonPumpInhibitors, Antiepileptics)
       )
     )
-      .extract(sources, Set.empty)
+      .extract(sources)
   }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DiagnosisExtractor.scala
index b7386345..218281a7 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DiagnosisExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DiagnosisExtractor.scala
@@ -4,7 +4,8 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses._
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses._
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions.unionDatasets
@@ -12,9 +13,9 @@ class DiagnosisExtractor(config: DiagnosesConfig) {

   def extract(sources: Sources): Dataset[Event[Diagnosis]] = {
-    val mainDiag = MainDiagnosisFallExtractor.extract(sources, config.dpCodes.toSet)
-    val linkedDiag = LinkedDiagnosisFallExtractor.extract(sources, config.drCodes.toSet)
-    val dasDiag = AssociatedDiagnosisFallExtractor.extract(sources, config.daCodes.toSet)
+    val mainDiag = McoMainDiagnosisExtractor(SimpleExtractorCodes(config.dpCodes)).extract(sources)
+    val linkedDiag = McoLinkedDiagnosisExtractor(SimpleExtractorCodes(config.drCodes)).extract(sources)
+    val dasDiag = McoAssociatedDiagnosisExtractor(SimpleExtractorCodes(config.daCodes)).extract(sources)

     unionDatasets(mainDiag, linkedDiag, dasDiag)
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DrugsExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DrugsExtractor.scala
index 7390f842..eb77da08 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DrugsExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/DrugsExtractor.scala
@@ -4,12 +4,11 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.{DrugConfig, DrugExtractor}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources

 class DrugsExtractor(drugConfig: DrugConfig) {
-  def extract(sources: Sources): Dataset[Event[Drug]] = {
-    new DrugExtractor(drugConfig).extract(sources, Set.empty)
-  }
+  def extract(sources: Sources): Dataset[Event[Drug]] = DrugExtractor(drugConfig).extract(sources)
+
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/EpilepticsExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/EpilepticsExtractor.scala
index 4584adad..4be0393c 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/EpilepticsExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/EpilepticsExtractor.scala
@@ -4,16 +4,17 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses.{ImbDiagnosisExtractor, McoLinkedDiagnosisExtractor, McoMainDiagnosisExtractor}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.{ImbCimDiagnosisExtractor, McoLinkedDiagnosisExtractor, McoMainDiagnosisExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions.unionDatasets

 object EpilepticsExtractor {
   def extract(sources: Sources): Dataset[Event[Diagnosis]] = {
-    val mainDiag = McoMainDiagnosisExtractor.extract(sources, Set("G40"))
-    val linkedDiag = McoLinkedDiagnosisExtractor.extract(sources, Set("G40"))
-    val imbDiag = ImbDiagnosisExtractor.extract(sources, Set("G40"))
+    val mainDiag = McoMainDiagnosisExtractor(SimpleExtractorCodes(List("G40"))).extract(sources)
+    val linkedDiag = McoLinkedDiagnosisExtractor(SimpleExtractorCodes(List("G40"))).extract(sources)
+    val imbDiag = ImbCimDiagnosisExtractor(SimpleExtractorCodes(List("G40"))).extract(sources)

     unionDatasets(mainDiag, linkedDiag, imbDiag)
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/FallHospitalStayExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/FallHospitalStayExtractor.scala
new file mode 100644
index 00000000..e8a26b7c
--- /dev/null
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/FallHospitalStayExtractor.scala
@@ -0,0 +1,70 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.study.fall.extractors
+
+import java.sql.{Date, Timestamp}
+import org.apache.spark.sql.Row
+import fr.polytechnique.cmap.cnam.etl.events.{EventBuilder, HospitalStay, McoHospitalStay}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.extractors.sources.mco.McoSimpleExtractor
+import fr.polytechnique.cmap.cnam.etl.extractors.IsInStrategy
+
+class FallHospitalStayExtractor(codes: SimpleExtractorCodes) extends McoSimpleExtractor[HospitalStay]
+  with IsInStrategy[HospitalStay] {
+
+  val exitCodes: (String) => (ExitMode) = {
+    case "0" => TransferAct
+    case "6" => Mutation
+    case "7" => Transfer
+    case "8" => Home
+    case "9" => Death
+    case _ => Unknown
+  }
+
+  override def getCodes: SimpleExtractorCodes = codes
+
+  override def columnName: String = ColNames.ExitMode
+
+  override def eventBuilder: EventBuilder = McoHospitalStay
+
+  override def neededColumns: List[String] = List(ColNames.EndDate, ColNames.ExitMode) ++ super.usedColumns
+
+  override def extractEnd(r: Row): Option[Timestamp] = Some {
+    if (!r.isNullAt(r.fieldIndex(ColNames.EndDate))) {
+      new Timestamp(r.getAs[Date](ColNames.EndDate).getTime)
+    }
+    else { // This shouldn't happen, but some hospital stays come without an EndDate
+      extractStart(r)
+    }
+
+  }
+
+  override def extractValue(row: Row): String = exitCodes(row.getAs[String](columnName)).value
+}
+
+sealed trait ExitMode extends Serializable {
+  def value: String
+}
+
+object Death extends ExitMode {
+  override def value: String = "death"
+}
+
+object Mutation extends ExitMode {
+  override def value: String = "mutation"
+}
+
+object Transfer extends ExitMode {
+  override def value: String = "transfer"
+}
+
+object Home extends ExitMode {
+  override def value: String = "home"
+}
+
+object TransferAct extends ExitMode {
+  override def value: String = "transfer_act"
+}
+
+object Unknown extends ExitMode {
+  override def value: String = "unknown"
+}
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/HTAExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/HTAExtractor.scala
index a1f54674..2b954cd7 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/HTAExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/HTAExtractor.scala
@@ -4,13 +4,13 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.{DrugConfig, DrugExtractor}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.Antihypertenseurs
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.Antihypertenseurs
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources

 object HTAExtractor {
   def extract(sources: Sources): Dataset[Event[Drug]] = {
-    new DrugExtractor(DrugConfig(TherapeuticLevel, List(Antihypertenseurs))).extract(sources, Set.empty)
+    new DrugExtractor(DrugConfig(TherapeuticLevel, List(Antihypertenseurs))).extract(sources)
   }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/IPPExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/IPPExtractor.scala
index abedb0ce..e346ab07 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/IPPExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/IPPExtractor.scala
@@ -4,13 +4,13 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.{DrugConfig, DrugExtractor}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.ProtonPumpInhibitors
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.MoleculeCombinationLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.ProtonPumpInhibitors
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources

 object IPPExtractor {
   def extract(sources: Sources): Dataset[Event[Drug]] = {
-    new DrugExtractor(DrugConfig(MoleculeCombinationLevel, List(ProtonPumpInhibitors))).extract(sources, Set.empty)
+    new DrugExtractor(DrugConfig(TherapeuticLevel, List(ProtonPumpInhibitors))).extract(sources)
   }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/McoDiagnosisExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/McoDiagnosisExtractor.scala
deleted file mode 100644
index b564e352..00000000
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/McoDiagnosisExtractor.scala
+++ /dev/null
@@ -1,27 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.study.fall.extractors
-
-import org.apache.spark.sql.Row
-import fr.polytechnique.cmap.cnam.etl.events.Diagnosis
-import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses.{McoAssociatedDiagnosisExtractor, McoLinkedDiagnosisExtractor, McoMainDiagnosisExtractor}
-import fr.polytechnique.cmap.cnam.etl.extractors.mco.McoExtractor
-import fr.polytechnique.cmap.cnam.study.fall.fractures.Surgery
-
-trait ClassifyWeight extends McoExtractor[Diagnosis] with Surgery {
-  override def extractWeight(r: Row): Double = {
-    if (!r.isNullAt(r.fieldIndex(ColNames.ExitMode)) && getExit(r).equals("9")) {
-      4
-    } else if (!r.isNullAt(r.fieldIndex(ColNames.CCAM)) && codes.contains(r.getAs[String](ColNames.CCAM))) {
-      3
-    } else {
-      2
-    }
-  }
-}
-
-object MainDiagnosisFallExtractor extends McoMainDiagnosisExtractor with ClassifyWeight
-
-object AssociatedDiagnosisFallExtractor extends McoAssociatedDiagnosisExtractor with ClassifyWeight
-
-object LinkedDiagnosisFallExtractor extends McoLinkedDiagnosisExtractor with ClassifyWeight
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/OpioidsExtractor.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/OpioidsExtractor.scala
index bc43495a..ee7286e3 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/OpioidsExtractor.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/OpioidsExtractor.scala
@@ -4,13 +4,13 @@ package fr.polytechnique.cmap.cnam.study.fall.extractors

 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.MoleculeCombinationLevel
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.Opioids
-import fr.polytechnique.cmap.cnam.etl.extractors.drugs.{DrugConfig, DrugExtractor}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.Opioids
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.TherapeuticLevel
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.{DrugConfig, DrugExtractor}
 import fr.polytechnique.cmap.cnam.etl.sources.Sources

 object OpioidsExtractor {
   def extract(sources: Sources): Dataset[Event[Drug]] = {
-    new DrugExtractor(DrugConfig(MoleculeCombinationLevel, List(Opioids))).extract(sources, Set.empty)
+    new DrugExtractor(DrugConfig(TherapeuticLevel, List(Opioids))).extract(sources)
   }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformer.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformer.scala
index 082b7cc8..7054287d 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformer.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformer.scala
@@ -22,10 +22,18 @@ class FracturesTransformer(config: FallConfig) extends OutcomesTransformer with

   def transform(
     liberalActs: Dataset[Event[MedicalAct]],
     acts: Dataset[Event[MedicalAct]],
-    diagnoses: Dataset[Event[Diagnosis]]): Dataset[Event[Outcome]] = {
+    diagnoses: Dataset[Event[Diagnosis]],
+    surgeries: Dataset[Event[MedicalAct]],
+    hospitalDeaths: Dataset[Event[HospitalStay]]): Dataset[Event[Outcome]] = {

     // Hospitalized fractures
-    val hospitalizedFractures = HospitalizedFractures.transform(diagnoses, acts, config.sites.sites)
+    val hospitalizedFractures = HospitalizedFractures.transform(
+      diagnoses,
+      acts.filter(_.category == McoCCAMAct.category),
+      hospitalDeaths,
+      surgeries,
+      config.sites.sites
+    )

     // Liberal Fractures
     val liberalFractures = LiberalFractures.transform(liberalActs)
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFractures.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFractures.scala
index 32f0fa61..1dc314e2 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFractures.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFractures.scala
@@ -2,50 +2,25 @@

 package fr.polytechnique.cmap.cnam.study.fall.fractures

+import org.apache.spark.sql.Dataset
 import org.apache.spark.sql.functions._
-import org.apache.spark.sql.{Dataset, SparkSession}
 import fr.polytechnique.cmap.cnam.etl.events._
 import fr.polytechnique.cmap.cnam.etl.transformers.outcomes.OutcomesTransformer
 import fr.polytechnique.cmap.cnam.study.fall.codes.FractureCodes
+import fr.polytechnique.cmap.cnam.study.fall.extractors.Death
+import fr.polytechnique.cmap.cnam.util.functions.unionDatasets

 /*
  * The rules for this Outcome definition can be found on the following page:
  * https://datainitiative.atlassian.net/wiki/spaces/CFC/pages/61282101/General+fractures+Fall+study
  */
-case class HospitalStayID(patientID: String, id: String)
+case class HospitalStayID(patientID: String, groupID: String)

 object HospitalizedFractures extends OutcomesTransformer with FractureCodes {

   override val outcomeName: String = "hospitalized_fall"

-  def transform(
-    diagnoses: Dataset[Event[Diagnosis]],
-    acts: Dataset[Event[MedicalAct]],
-    ghmSites: List[BodySite]): Dataset[Event[Outcome]] = {
-
-    import diagnoses.sqlContext.implicits._
-
-    val ghmCodes = BodySite.extractCIM10CodesFromSites(ghmSites)
-
-    val correctCIM10Event = diagnoses
-      .filter(diagnosis => isFractureDiagnosis(diagnosis, ghmCodes))
-
-    val incorrectGHMStays = acts
-      .filter(isBadGHM _)
-      .map(event => HospitalStayID(event.patientID, event.groupID))
-      .distinct()
-
-    filterHospitalStay(correctCIM10Event, incorrectGHMStays)
-      .map(
-        event => Outcome(
-          event.patientID,
-          BodySite.getSiteFromCode(event.value, ghmSites, CodeType.CIM10),
-          outcomeName,
-          event.weight,
-          event.start
-        )
-      )
-
-  }
-
   def isFractureDiagnosis(event: Event[Diagnosis], ghmSites: List[String]): Boolean = {
     isInCodeList(event, ghmSites.toSet)
   }
@@ -59,17 +34,16 @@ object HospitalizedFractures extends OutcomesTransformer with FractureCodes {
   }

   /**
-    * filters diagnosis that do not have a DP in the same hospital stay
-    * and the diagnosis that relates to an incorrectGHMStay
+    * Filters out diagnoses that do not have a main diagnosis (DP) in the same hospital stay that is a fracture diagnosis.
+    *
+    * @param diagnoses Fracture diagnoses, including DP, DR and DA diagnoses.
+    * @return Diagnoses with a DP in the same hospital stay that is a fracture diagnosis.
     */
-  def filterHospitalStay(
-    events: Dataset[Event[Diagnosis]],
-    incorrectGHMStays: Dataset[HospitalStayID])
-  : Dataset[Event[Diagnosis]] = {
-
-    val spark: SparkSession = events.sparkSession
-    import spark.implicits._
-
-    val fracturesDiagnoses = events
+  def filterDiagnosesWithoutDP(diagnoses: Dataset[Event[Diagnosis]]): Dataset[Event[Diagnosis]] = {
+
+    import diagnoses.sparkSession.implicits._
+
+    diagnoses
       .groupByKey(_.groupID)
       .flatMapGroups { case (_, diagnoses) =>
         val diagnosisStream = diagnoses.toStream
@@ -78,17 +52,115 @@ object HospitalizedFractures extends OutcomesTransformer with FractureCodes {
         } else {
           Seq.empty
         }
-      }.toDF()
+      }
+  }

+  /**
+    * Gets the IDs of hospital stays that are mainly for fracture followup, such as plaster removal.
+    *
+    * @param acts All CCAM code events, from every source.
+    * @return IDs of hospital stays for fracture followup.
+    */
+  def getFractureFollowUpStays(acts: Dataset[Event[MedicalAct]]): Dataset[HospitalStayID] = {
+    import acts.sparkSession.implicits._
+
+    acts
+      .filter(_.category == McoCCAMAct.category)
+      .filter(isBadGHM _)
+      .map(event => HospitalStayID(event.patientID, event.groupID))
+      .distinct()
+  }

-    val patientsToFilter = incorrectGHMStays.select("patientID")
+  /**
+    * Filters out diagnoses whose groupID appears in followUpStaysForFractures.
+    *
+    * @param fracturesDiagnoses Dataset of fracture diagnoses.
+    * @param followUpStaysForFractures Dataset of hospital stays for followup of fractures.
+    * @return Fracture diagnoses that do not belong to a followup stay.
+    */
+  def filterDiagnosisForFracturesFollowUp(followUpStaysForFractures: Dataset[HospitalStayID])
+    (fracturesDiagnoses: Dataset[Event[Diagnosis]]): Dataset[Event[Diagnosis]] = {
+    import fracturesDiagnoses.sparkSession.implicits._

     fracturesDiagnoses
-      .join(broadcast(patientsToFilter), Seq("patientID"), "left_anti")
-      .as[Event[Diagnosis]]
+      .joinWith(
+        broadcast(followUpStaysForFractures),
+        fracturesDiagnoses(Event.Columns.PatientID) === followUpStaysForFractures("patientID")
+          && fracturesDiagnoses(Event.Columns.GroupID) === followUpStaysForFractures("groupID"),
+        "left_outer"
+      )
+      .filter(_._2 == null)
+      .map(_._1)
+  }
+
+  def getFourthLevelSeverity(stays: Dataset[Event[HospitalStay]])
+    (diagnoses: Dataset[Event[Diagnosis]]): Dataset[Event[Diagnosis]] = {
+    import stays.sparkSession.implicits._
+    diagnoses.joinWith(
+      stays.filter(_.value == Death.value),
+      diagnoses(Event.Columns.PatientID) === stays(Event.Columns.PatientID)
+        && diagnoses(Event.Columns.GroupID) === stays(Event.Columns.GroupID),
+      "inner"
+    )
+      .map(_._1)
   }

-  def isMainOrDASDiagnosis(event: Event[Diagnosis]): Boolean = {
-    event.category == McoMainDiagnosis.category || event.category == McoAssociatedDiagnosis.category
+  def getThirdLevelSeverity(surgeries: Dataset[Event[MedicalAct]])
+    (diagnoses: Dataset[Event[Diagnosis]]): Dataset[Event[Diagnosis]] = {
+    import surgeries.sparkSession.implicits._
+    diagnoses.joinWith(
+      surgeries,
+      diagnoses(Event.Columns.PatientID) === surgeries(Event.Columns.PatientID)
+        && diagnoses(Event.Columns.GroupID) === surgeries(Event.Columns.GroupID),
+      "inner"
+    )
+      .map(_._1)
   }

+  def assignSeverityToDiagnosis(stays: Dataset[Event[HospitalStay]], surgeries: Dataset[Event[MedicalAct]])
+    (diagnoses: Dataset[Event[Diagnosis]]): Dataset[Event[Diagnosis]] = {
+
+    val fourthLevelSeverity = diagnoses.transform(getFourthLevelSeverity(stays)).cache()
+
+    val notFourthLevel = diagnoses.except(fourthLevelSeverity).cache()
+    val thirdLevelSeverity = notFourthLevel
+      .transform(getThirdLevelSeverity(surgeries)).cache()
+
+    val secondLevelSeverity = notFourthLevel.except(thirdLevelSeverity)
+    import surgeries.sparkSession.implicits._
+    unionDatasets(
+      fourthLevelSeverity.map(_.copy(weight = 4D)),
+      thirdLevelSeverity.map(_.copy(weight = 3D)),
+      secondLevelSeverity.map(_.copy(weight = 2D))
+    )
+  }
+
+  def transform(
+    diagnoses: Dataset[Event[Diagnosis]],
+    acts: Dataset[Event[MedicalAct]],
+    stays: Dataset[Event[HospitalStay]],
+    surgeries: Dataset[Event[MedicalAct]],
+    ghmSites: List[BodySite]
+  ): Dataset[Event[Outcome]] = {
+
+    import diagnoses.sqlContext.implicits._
+    val ghmCodes = BodySite.extractCIM10CodesFromSites(ghmSites)
+    val diagnosisWithDP = diagnoses.filter(diagnosis => isFractureDiagnosis(diagnosis, ghmCodes))
+      .transform(filterDiagnosesWithoutDP)
+    val fractureFollowUpHospitalStays = getFractureFollowUpStays(acts)
+
+    diagnosisWithDP
+      .transform(filterDiagnosisForFracturesFollowUp(fractureFollowUpHospitalStays))
+      .transform(assignSeverityToDiagnosis(stays, surgeries))
+      .map(
+        event => Outcome(
+          event.patientID,
+          BodySite.getSiteFromCode(event.value, ghmSites, CodeType.CIM10),
+          outcomeName,
+          event.weight,
+          event.start
+        )
+      )
+
+  }
 }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFractures.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFractures.scala
index 7b1979bc..6aaf12ad 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFractures.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFractures.scala
@@ -17,7 +17,7 @@ object LiberalFractures extends OutcomesTransformer {
       .map(
         event => {
           val fractureSite = BodySite.getSiteFromCode(event.value, BodySites.sites, CodeType.CCAM)
-          Outcome(event.patientID, fractureSite, outcomeName, event.weight, event.start)
+          Outcome(event.patientID, fractureSite, outcomeName, 1.0D, event.start)
         }
       )
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFractures.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFractures.scala
index 0411589c..ab877eb3 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFractures.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFractures.scala
@@ -25,7 +25,7 @@ object PrivateAmbulatoryFractures extends OutcomesTransformer with FractureCodes
       .map(
         event => {
           val fractureSite = BodySite.getSiteFromCode(event.value, BodySites.sites, CodeType.CCAM)
-          Outcome(event.patientID, fractureSite, outcomeName, event.weight, event.start)
+          Outcome(event.patientID, fractureSite, outcomeName, 1.0D, event.start)
         }
       )
   }
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFractures.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFractures.scala
index 48fea1ef..88ab8b6b 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFractures.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFractures.scala
@@ -20,13 +20,13 @@ object PublicAmbulatoryFractures extends OutcomesTransformer with FractureCodes
       .map(
         event => {
           val fractureSite = BodySite.getSiteFromCode(event.value, BodySites.sites, CodeType.CCAM)
-          Outcome(event.patientID, fractureSite, outcomeName, event.weight, event.start)
+          Outcome(event.patientID, fractureSite, outcomeName, 1.0D, event.start)
         }
       )
   }

   def isPublicAmbulatory(event: Event[MedicalAct]): Boolean = {
-    event.category == McoCEAct.category
+    event.category == McoCeCcamAct.category
   }

   def containsNonHospitalizedCcam(event: Event[MedicalAct]): Boolean = {
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/Surgery.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/Surgery.scala
index c5689960..c0b050d1 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/Surgery.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/Surgery.scala
@@ -2,8 +2,8 @@

 package fr.polytechnique.cmap.cnam.study.fall.fractures

-trait Surgery {
-  val codes = Set(
+object Surgery {
+  val surgeryCodes = Set(
     "QAGA004",
     "QZGA003",
     "EEGA002",
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneConfig.scala
index b92da34a..3e801134 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneConfig.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneConfig.scala
@@ -3,14 +3,13 @@ package fr.polytechnique.cmap.cnam.study.pioglitazone

 import java.time.LocalDate
-import pureconfig.generic.auto._
-import me.danielpes.spark.datetime.Period
 import me.danielpes.spark.datetime.implicits._
+import pureconfig.generic.auto._
 import fr.polytechnique.cmap.cnam.etl.config.study.StudyConfig
 import fr.polytechnique.cmap.cnam.etl.config.{BaseConfig, ConfigLoader}
-import fr.polytechnique.cmap.cnam.etl.extractors.acts.MedicalActsConfig
-import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses.DiagnosesConfig
-import fr.polytechnique.cmap.cnam.etl.extractors.molecules.MoleculePurchasesConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.acts.MedicalActsConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.DiagnosesConfig
+import fr.polytechnique.cmap.cnam.etl.extractors.events.molecules.MoleculePurchasesConfig
 import fr.polytechnique.cmap.cnam.etl.extractors.patients.PatientsConfig
 import fr.polytechnique.cmap.cnam.etl.transformers.exposures._
 import fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerConfig
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneMain.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneMain.scala
index 38b73769..fdc56f13 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneMain.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/PioglitazoneMain.scala
@@ -8,10 +8,9 @@ import scala.collection.mutable.ListBuffer

 import org.apache.spark.sql.{Dataset, SQLContext}
 import fr.polytechnique.cmap.cnam.Main
 import fr.polytechnique.cmap.cnam.etl.events._
-import fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays.McoHospitalStaysExtractor
-import fr.polytechnique.cmap.cnam.etl.extractors.molecules.MoleculePurchases
-import fr.polytechnique.cmap.cnam.etl.extractors.patients.Patients
-import fr.polytechnique.cmap.cnam.etl.extractors.tracklosses.{Tracklosses, TracklossesConfig}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.McoHospitalStaysExtractor
+import fr.polytechnique.cmap.cnam.etl.extractors.events.molecules.MoleculePurchases
+import fr.polytechnique.cmap.cnam.etl.extractors.patients.AllPatientExtractor
 import fr.polytechnique.cmap.cnam.etl.filters.PatientFilters
 import fr.polytechnique.cmap.cnam.etl.implicits
 import fr.polytechnique.cmap.cnam.etl.patients.Patient
@@ -19,6 +18,8 @@ import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.etl.transformers.exposures.ExposureTransformer
 import fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformer
 import fr.polytechnique.cmap.cnam.etl.transformers.observation.ObservationPeriodTransformer
+import fr.polytechnique.cmap.cnam.etl.transformers.patients.PatientFilters
+import fr.polytechnique.cmap.cnam.etl.transformers.tracklosses.{TracklossTransformer, TracklossesConfig}
 import fr.polytechnique.cmap.cnam.study.pioglitazone.extractors.{Diagnoses, MedicalActs}
 import fr.polytechnique.cmap.cnam.study.pioglitazone.outcomes._
 import fr.polytechnique.cmap.cnam.util.datetime.implicits._
@@ -50,7 +51,7 @@ object PioglitazoneMain extends Main {
     val sources = Sources.sanitize(sqlContext.readSources(config.input))

     // Extraction: get all events
-    val rawPatients: Dataset[Patient] = new Patients(config.patients).extract(sources).cache()
+    val rawPatients: Dataset[Patient] = AllPatientExtractor.extract(sources).cache()
     operationsMetadata += {
       OperationReporter
         .report(
@@ -63,6 +64,19 @@ object PioglitazoneMain extends Main {
       )
     }

+    val filteredPatients: Dataset[Patient] = new PatientFilters(config.patients).filterPatients(rawPatients).cache()
+    operationsMetadata += {
+      OperationReporter
+        .report(
+          "filtered_subjects",
+          List("DCIR", "MCO", "IR_BEN_R"),
+          OperationTypes.Patients,
+          filteredPatients.toDF,
+          Path(config.output.outputSavePath),
+          config.output.saveMode
+        )
+    }
+
     val rawDrugPurchases: Dataset[Event[Molecule]] = new MoleculePurchases(config.molecules).extract(sources).cache()
     operationsMetadata += {
       OperationReporter
@@ -104,7 +118,7 @@ object PioglitazoneMain extends Main {
     val rawTracklosses = {
       val tracklossConfig = TracklossesConfig(studyEnd = config.base.studyEnd)
-      new Tracklosses(tracklossConfig).extract(sources).cache()
+      new TracklossTransformer(tracklossConfig).transform(rawDrugPurchases).cache()
     }
     operationsMetadata += {
       OperationReporter
@@ -118,7 +132,7 @@
       )
     }

-    val rawHospitalStays = McoHospitalStaysExtractor.extract(sources, Set.empty).cache()
+    val rawHospitalStays = McoHospitalStaysExtractor.extract(sources).cache()
     operationsMetadata += {
       OperationReporter.report(
         "extract_hospital_stays",
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/Diagnoses.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/Diagnoses.scala
index 68870447..e4076e8e 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/Diagnoses.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/Diagnoses.scala
@@ -4,7 +4,8 @@
package fr.polytechnique.cmap.cnam.study.pioglitazone.extractors import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event} -import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses._ +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses._ import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.util.functions @@ -12,9 +13,9 @@ class Diagnoses(config: DiagnosesConfig) { def extract(sources: Sources): Dataset[Event[Diagnosis]] = { - val mainDiag = McoMainDiagnosisExtractor.extract(sources, config.dpCodes.toSet) - val linkedDiag = McoLinkedDiagnosisExtractor.extract(sources, config.drCodes.toSet) - val associatedDiag = McoAssociatedDiagnosisExtractor.extract(sources, config.daCodes.toSet) + val mainDiag = McoMainDiagnosisExtractor(SimpleExtractorCodes(config.dpCodes)).extract(sources) + val linkedDiag = McoLinkedDiagnosisExtractor(SimpleExtractorCodes(config.drCodes)).extract(sources) + val associatedDiag = McoAssociatedDiagnosisExtractor(SimpleExtractorCodes(config.daCodes)).extract(sources) //val imbDiag = ImbDiagnosisExtractor.extract(sources, config.imbCodes.toSet) functions.unionDatasets(mainDiag, linkedDiag, associatedDiag)//, imbDiag) } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/MedicalActs.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/MedicalActs.scala index d5ff5062..6457a200 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/MedicalActs.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/extractors/MedicalActs.scala @@ -4,17 +4,18 @@ package fr.polytechnique.cmap.cnam.study.pioglitazone.extractors import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.etl.events.{Event, MedicalAct} -import fr.polytechnique.cmap.cnam.etl.extractors.acts._ +import 
fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.events.acts._ import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.util.functions.unionDatasets class MedicalActs(config: MedicalActsConfig) { def extract(sources: Sources): Dataset[Event[MedicalAct]] = { - val dcirActs = DcirMedicalActExtractor.extract(sources, config.dcirCodes.toSet) - val ccamActs = McoCcamActExtractor.extract(sources, config.mcoCCAMCodes.toSet) - val cimActs = McoCimMedicalActExtractor.extract(sources, config.mcoCIMCodes.toSet) + val dcirActs = DcirMedicalActExtractor(SimpleExtractorCodes(config.dcirCodes)).extract(sources) + val ccamActs = McoCcamActExtractor(SimpleExtractorCodes(config.mcoCCAMCodes)).extract(sources) + //val cimActs = McoCimMedicalActExtractor(BaseExtractorCodes(config.mcoCIMCodes)).extract(sources) - unionDatasets(dcirActs, ccamActs, cimActs) + unionDatasets(dcirActs, ccamActs) //, cimActs } } diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneConfig.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneConfig.scala index 7797e179..12dac1df 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneConfig.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneConfig.scala @@ -3,12 +3,12 @@ package fr.polytechnique.cmap.cnam.study.rosiglitazone import java.time.LocalDate -import pureconfig.generic.auto._ import me.danielpes.spark.datetime.implicits._ +import pureconfig.generic.auto._ import fr.polytechnique.cmap.cnam.etl.config.study.StudyConfig import fr.polytechnique.cmap.cnam.etl.config.{BaseConfig, ConfigLoader} -import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses.DiagnosesConfig -import fr.polytechnique.cmap.cnam.etl.extractors.molecules.MoleculePurchasesConfig +import 
fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses.DiagnosesConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.molecules.MoleculePurchasesConfig import fr.polytechnique.cmap.cnam.etl.extractors.patients.PatientsConfig import fr.polytechnique.cmap.cnam.etl.transformers.exposures._ import fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerConfig diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneMain.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneMain.scala index 1455c8f3..af0ab02b 100644 --- a/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneMain.scala +++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/RosiglitazoneMain.scala @@ -7,10 +7,9 @@ import scala.collection.mutable.ListBuffer import org.apache.spark.sql.{Dataset, SQLContext} import fr.polytechnique.cmap.cnam.Main import fr.polytechnique.cmap.cnam.etl.events._ -import fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays.McoHospitalStaysExtractor -import fr.polytechnique.cmap.cnam.etl.extractors.molecules.MoleculePurchases -import fr.polytechnique.cmap.cnam.etl.extractors.patients.Patients -import fr.polytechnique.cmap.cnam.etl.extractors.tracklosses.{Tracklosses, TracklossesConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays.McoHospitalStaysExtractor +import fr.polytechnique.cmap.cnam.etl.extractors.events.molecules.MoleculePurchases +import fr.polytechnique.cmap.cnam.etl.extractors.patients.AllPatientExtractor import fr.polytechnique.cmap.cnam.etl.filters.PatientFilters import fr.polytechnique.cmap.cnam.etl.implicits import fr.polytechnique.cmap.cnam.etl.patients.Patient @@ -18,6 +17,8 @@ import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.etl.transformers.exposures.ExposureTransformer import fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformer import 
fr.polytechnique.cmap.cnam.etl.transformers.observation.ObservationPeriodTransformer +import fr.polytechnique.cmap.cnam.etl.transformers.patients.PatientFilters +import fr.polytechnique.cmap.cnam.etl.transformers.tracklosses.{TracklossTransformer, TracklossesConfig} import fr.polytechnique.cmap.cnam.study.rosiglitazone.extractors.Diagnoses import fr.polytechnique.cmap.cnam.study.rosiglitazone.outcomes.RosiglitazoneOutcomeTransformer import fr.polytechnique.cmap.cnam.util.Path @@ -45,10 +46,23 @@ object RosiglitazoneMain extends Main { val sources = Sources.sanitize(sqlContext.readSources(config.input)) //Extracting Patients - val patients: Dataset[Patient] = new Patients(config.patients).extract(sources).cache() + val rawpatients: Dataset[Patient] = AllPatientExtractor.extract(sources).cache() operationsMetadata += { OperationReporter.report( - "extract_patients", + "extract_raw_patients", + List("DCIR", "MCO", "IR_BEN_R"), + OperationTypes.Patients, + rawpatients.toDF(), + Path(config.output.outputSavePath), + config.output.saveMode + ) + } + + //Extracting Patients + val patients: Dataset[Patient] = new PatientFilters(config.patients).filterPatients(rawpatients).cache() + operationsMetadata += { + OperationReporter.report( + "extract_filtered_patients", List("DCIR", "MCO", "IR_BEN_R"), OperationTypes.Patients, patients.toDF(), @@ -85,7 +99,7 @@ object RosiglitazoneMain extends Main { ) } - val hospitalStays = McoHospitalStaysExtractor.extract(sources, Set.empty).cache() + val hospitalStays = McoHospitalStaysExtractor.extract(sources).cache() operationsMetadata += { OperationReporter.report( "extract_hospital_stays", @@ -119,7 +133,7 @@ object RosiglitazoneMain extends Main { //Extract Trackloss val tracklosses = { val tracklossConfig = TracklossesConfig(studyEnd = config.base.studyEnd) - new Tracklosses(tracklossConfig).extract(sources).cache() + new TracklossTransformer(tracklossConfig).transform(drugPurchases).cache() } operationsMetadata += { OperationReporter 
diff --git a/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/extractors/Diagnoses.scala b/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/extractors/Diagnoses.scala
index c345e815..5fe80f52 100644
--- a/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/extractors/Diagnoses.scala
+++ b/src/main/scala/fr/polytechnique/cmap/cnam/study/rosiglitazone/extractors/Diagnoses.scala
@@ -4,7 +4,8 @@ package fr.polytechnique.cmap.cnam.study.rosiglitazone.extractors
 import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event}
-import fr.polytechnique.cmap.cnam.etl.extractors.diagnoses._
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses._
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions
@@ -12,9 +13,9 @@ class Diagnoses(config: DiagnosesConfig) {
   def extract(sources: Sources): Dataset[Event[Diagnosis]] = {
-    val mainDiag = McoMainDiagnosisExtractor.extract(sources, config.dpCodes.toSet)
-    val linkedDiag = McoLinkedDiagnosisExtractor.extract(sources, config.drCodes.toSet)
-    val associatedDiag = McoAssociatedDiagnosisExtractor.extract(sources, config.daCodes.toSet)
+    val mainDiag = McoMainDiagnosisExtractor(SimpleExtractorCodes(config.dpCodes)).extract(sources)
+    val linkedDiag = McoLinkedDiagnosisExtractor(SimpleExtractorCodes(config.drCodes)).extract(sources)
+    val associatedDiag = McoAssociatedDiagnosisExtractor(SimpleExtractorCodes(config.daCodes)).extract(sources)
     functions.unionDatasets(mainDiag, linkedDiag, associatedDiag)
   }
diff --git a/src/test/resources/PMSI/T_MCOaa_nnFBSTC.csv b/src/test/resources/PMSI/T_MCOaa_nnFBSTC.csv
new file mode 100644
index 00000000..04cc8e84
--- /dev/null
+++ b/src/test/resources/PMSI/T_MCOaa_nnFBSTC.csv
@@ -0,0 +1,4 @@
+ACT_COD,ACT_COE,ACT_DNB,ACT_NBR,AMC_MNR,AMO_MNR,COEF_MCO,DEL_DAT_ENT,ETA_NUM,ETA_NUM_GEO,EXE_SPE,EXO_TM,HON_MNT,NUM_FAC,PRI_UNI,PSH_DMT,PSH_MDT,REM_BAS,REM_TAU,RSF_TYP,SEQ_NUM,SOR_ANN,SOR_MOI,TYP_ART,TYP_FPI
+ABC,1.0,47,13,7083244.63,74666912.92,90593.7962,87701,390780146,UKObWJm,1,JrmsfEVr,12015701.82,yKWJO,6751465.02,dftrehpdSXgmPYxKLw,HtmEEPoJZNUYjK,52232224.74,530,ogD,00064268,2014,odz,I,FcpMYDQXuLXXicz
+ABG,42.0,54,604,9248747.63,76316698.24,12890.6905,27754,190000059,LZqDvhEnofoP,,BNBN,74937208.48,ededzd,7342024.15,MlFnnCA,CcxCwN,96066675.71,867,mpN,00022621,2014,qVsIPuQ,X,wEcWCNecLDJcashSJO
+ACO,,100,92,4013746.75,60935884.85,39460.6247,38073,390780146,OJqUNrNnJtHLiRXG,22,cuwUELVG,10034714.74,qEAWmFaaUVWUKgW,2651112.13,RTtAdKMS,nON,81328743.81,397,WXYr,00114237,2014,XW,H,Gavod
diff --git a/src/test/resources/PMSI/T_MCOaa_nnFCSTC.csv b/src/test/resources/PMSI/T_MCOaa_nnFCSTC.csv
new file mode 100644
index 00000000..76b77f5c
--- /dev/null
+++ b/src/test/resources/PMSI/T_MCOaa_nnFCSTC.csv
@@ -0,0 +1,4 @@
+ACT_COD,ACT_COE,ACT_DNB,ACT_NBR,AMC_MNR,AMO_MNR,COEF_MCO,CONSULT_MIG,DEL_DAT_ENT,ETA_NUM,ETA_NUM_GEO,EXE_SPE,EXO_TM,HON_MNT,NUM_FAC,PRI_UNI,PSH_DMT,PSH_MDT,REM_BAS,REM_TAU,RSF_TYP,SEQ_NUM,SOR_ANN,SOR_MOI,TYP_ART,DAT_RET,NIR_ANO_17,NIAS_RET,SEX_RET,NOE_MNR,NOE_OPE,NIR_RET
+ADE,802770.97,95,754,909104.28,19056776.78,30247.1354,zwJmWpuZRvEbySDeO,14775,390780146,AoEhdAjLtQjD,1,y,31336037.18,wHtQPioDLBPGAsG,3261939.8,PZVV,SgTeuUTsBAPP,9674004.7,294,eDLOvjVzKzLFh,00114237,KMMzOCcn,BXIyjqQw,X,0,uJNf,0,0,44091236,435,0
+ADC,420416.2,28,124,8879822.76,68200463.67,42206.718,BGvBlvvFSRkVp,8328,710780214,EPngKCorUZHRrtL,25,ypirRYVD,14755336.76,TsMNRhvuyBzmgsf,6242542.15,XkraU,OfEarDFRf,38840813.17,528,iVsWLTHURmVBRHHGPYR,00000130,Hfc,x,H,0,sUzrFNwAIWJBq,0,0,6701917,399,0
+AF,126936.43,4,19,5342621.25,41563748.97,9526.5076,wuZAhfcFNbJgrACI,84179,390780146,TMCbZywqMYisk,13,HCIF,59253505.42,efefz,1203645.95,qDjlCyzADWhx,oojjbj,56565841.8,371,tjyoD,00026744,eyNxA,KryZYbNDd,C,0,rTNotODgUKNtpBBhfEm,0,0,28198642,955,0
diff --git a/src/test/resources/test-input/DCIR.parquet b/src/test/resources/test-input/DCIR.parquet
index bd269990..07db2023 100644
Binary files a/src/test/resources/test-input/DCIR.parquet and b/src/test/resources/test-input/DCIR.parquet differ
diff --git a/src/test/resources/test-input/DCIR_w_BIO.parquet b/src/test/resources/test-input/DCIR_w_BIO.parquet
index 8761db06..bd269990 100644
Binary files a/src/test/resources/test-input/DCIR_w_BIO.parquet and b/src/test/resources/test-input/DCIR_w_BIO.parquet differ
diff --git a/src/test/resources/test-input/HAD.parquet b/src/test/resources/test-input/HAD.parquet
index 24095b42..bbc3928c 100644
Binary files a/src/test/resources/test-input/HAD.parquet and b/src/test/resources/test-input/HAD.parquet differ
diff --git a/src/test/resources/test-input/MCO_CE.parquet b/src/test/resources/test-input/MCO_CE.parquet
index faa33317..14be81db 100644
Binary files a/src/test/resources/test-input/MCO_CE.parquet and b/src/test/resources/test-input/MCO_CE.parquet differ
diff --git a/src/test/resources/test-input/SSR.parquet b/src/test/resources/test-input/SSR.parquet
new file mode 100644
index 00000000..9e04cbdc
Binary files /dev/null and b/src/test/resources/test-input/SSR.parquet differ
diff --git a/src/test/resources/test-joined/SSR.parquet/.part-00000-38c76b26-d209-44f7-a863-d035d32207e6-c000.snappy.parquet.crc b/src/test/resources/test-joined/SSR.parquet/.part-00000-38c76b26-d209-44f7-a863-d035d32207e6-c000.snappy.parquet.crc
deleted file mode 100644
index cc1ee613..00000000
Binary files a/src/test/resources/test-joined/SSR.parquet/.part-00000-38c76b26-d209-44f7-a863-d035d32207e6-c000.snappy.parquet.crc and /dev/null differ
diff --git a/src/test/resources/test-joined/SSR.parquet/.part-00000-4659fc3c-6a63-4ab1-bcfe-620d052a165a-c000.snappy.parquet.crc b/src/test/resources/test-joined/SSR.parquet/.part-00000-4659fc3c-6a63-4ab1-bcfe-620d052a165a-c000.snappy.parquet.crc
new file mode 100644
index 00000000..670fab51
Binary files /dev/null and b/src/test/resources/test-joined/SSR.parquet/.part-00000-4659fc3c-6a63-4ab1-bcfe-620d052a165a-c000.snappy.parquet.crc differ
diff --git a/src/test/resources/test-joined/SSR.parquet/part-00000-38c76b26-d209-44f7-a863-d035d32207e6-c000.snappy.parquet b/src/test/resources/test-joined/SSR.parquet/part-00000-38c76b26-d209-44f7-a863-d035d32207e6-c000.snappy.parquet
deleted file mode 100644
index 6f7ba11f..00000000
Binary files a/src/test/resources/test-joined/SSR.parquet/part-00000-38c76b26-d209-44f7-a863-d035d32207e6-c000.snappy.parquet and /dev/null differ
diff --git a/src/test/resources/test-joined/SSR.parquet/part-00000-4659fc3c-6a63-4ab1-bcfe-620d052a165a-c000.snappy.parquet b/src/test/resources/test-joined/SSR.parquet/part-00000-4659fc3c-6a63-4ab1-bcfe-620d052a165a-c000.snappy.parquet
new file mode 100644
index 00000000..9a01ee22
Binary files /dev/null and b/src/test/resources/test-joined/SSR.parquet/part-00000-4659fc3c-6a63-4ab1-bcfe-620d052a165a-c000.snappy.parquet differ
diff --git a/src/test/resources/value_tables/IR_NAT_V.csv b/src/test/resources/value_tables/IR_NAT_V.csv
new file mode 100644
index 00000000..106b3079
--- /dev/null
+++ b/src/test/resources/value_tables/IR_NAT_V.csv
@@ -0,0 +1,1256 @@
+PRS_NAT;PRS_NAT_STA;PRS_NAT_CB2;PRS_NAT_LIB;NAT_SOI_TOP;TDM_NAT_AFF;PRS_NAT_TDM;NAT_CPL_ACT;CCA_CPL_COD;PRN_AID_TOP;PRS_SPP_TOP;TOP_SNI;TYA_COD;CEL_TYP;TOP_PRS_SOI;PRS_NAT_RGT;TOP_TRF_OPP;PRS_NAT_OBS;TEC_COL
+5204;3;FDO;FORFAIT ORTHODONTIE;0;0;6;0;;1;;1;Z;21;0;27;;;0
+5205;3;FPC;FORFAIT PROTHESE CONJOINTE (CMU HORS PANIER DE SOINS);1;0;2;0;;1;;1;Z;20;0;27;;;0
+5206;3;FPO;FORFAIT ORTHODONTIQUE (CMU HORS PANIER DE SOINS);1;0;2;0;;1;;1;Z;20;0;27;;;0
+5301;3;FTI;FORFAIT TIPS;0;0;6;0;;0;;1;Z;;0;6;;fermé le 01/01/2000;0
+5401;3;PAU;FORFAIT AUDIO-PROTHESE;0;0;6;0;;0;;1;Z;21;0;40;;;0
+6011;4;CAR-;CARENCE -;0;0;13;0;;0;;0;Z;;0;13;;;0
+6012;4;CAR +;CARENCE +;0;0;13;0;;0;;0;Z;;0;13;;;0
+6013;4;IJC;IJ CARENCE (CRPCEN);0;0;13;0;;0;;0;Z;23;0;13;;;0
+6014;4;CIJ;COMPLEMENT IJ >PLAFOND (CRPCEN);0;0;13;0;;0;;1;Z;23;0;13;;;0
+6110;99;ASN;IJ NORMALES + 6MOIS;1;9;99;0;;0;;1;Z;23;0;13;;fermé le 30/06/2008;0
+6111;4;NOR-/REN-/MIN-;IJ NORMALES -3 MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6112;4;NOR+/MIN+;IJ NORMALES +3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6113;4;1/1-;IJ REDUITES -3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6114;4;1/1+;IJ REDUITES +3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6115;4;MAJ-/REN+;IJ MAJOREES -3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6116;4;MAJ+/MIJ-/MIJ+;IJ MAJOREES +3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6117;4;MIT-;IJ PARTIELLE, PERTE DE SALAIRE -3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6118;4;MIT+;IJ PARTIELLE, PERTE DE SALAIRE +3MOIS;1;0;13;0;;0;;1;Z;;0;13;;;0
+6119;99;ASM;IJ MAJOREES + 6 MOIS;1;9;99;0;;0;;1;Z;23;0;13;;fermé le 30/06/2008;0
+6120;4;ITI;INDEMNITE TEMPORAIRE D'INAPTITUDE AT/MP;1;0;13;0;;0;;0;Z;23;0;13;;;0
+6121;4;PRE;IJ PRENATALES;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6122;4;POS;IJ POSTNATALES;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6123;4;ADO;IJ EN CAS D ADOPTION;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6124;4;ISM;IJ CONGE SUPPLEMENTAIRE PREMA;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6125;99;IRT+/IRT-;INDEMNITE REMPLACEMENT CONJOINTS COLLABORATEURS TI;0;9;99;0;;0;;0;Z;23;0;13;;;0
+6126;99;FGP;FORFAIT GROSSESSE TAUX PLEIN TI;0;9;99;0;;0;;0;Z;21;0;13;;;0
+6127;99;FGR;FORFAIT GROSSESSE TAUX REDUIT TI;0;9;99;0;;0;;0;Z;21;0;13;;;0
+6128;99;FAP;FORFAIT ADOPTION TAUX PLEIN TI;0;9;99;0;;0;;0;Z;21;0;13;;;0
+6129;99;FAR;FORFAIT ADOPTION TAUX REDUIT TI;0;9;99;0;;0;;0;Z;21;0;13;;;0
+6131;4;CUN-;IJ NORMALES POUR CURE THERMALE;1;0;13;0;;0;;1;Z;;0;24;;;0
+6132;4;CUM-;IJ MAJOREES POUR CURE THERMALE;1;0;13;0;;0;;1;Z;;0;24;;;0
+6133;4;CUR-;IJ REDUITES POUR CURE THERMALE;1;0;13;0;;0;;1;Z;;0;24;;;0
+6134;99;AAM-;IJ MALADIE PAMC MOINS DE 3 MOIS;0;9;99;0;;0;;0;Z;23;0;0;;;0
+6135;99;AAM+;IJ MALADIE PAMC PLUS DE 3 MOIS;0;9;99;0;;0;;0;Z;23;0;0;;;0
+6191;99;IDN;INDEMNITE DE NOURRITURE;0;9;99;0;;0;;0;Z;23;0;13;;;0
+6211;99;AFE+/AFE-;ALLOCATION FEMME ENCEINTE;0;9;99;0;;0;;0;Z;23;0;0;;;0
+6212;4;PER;IJ CONGE MATERNITE AU PERE;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6213;4;;INDEMNITE DE REMPLACEMENT PATERNITE;0;0;13;0;;0;;0;Z;;0;13;;Pas d information sur la prestation;0
+6214;99;ISP-;IJ CONGE SUPPLEMENTAIRE MATERNITE PAMC;0;9;99;0;;0;;0;Z;23;0;0;;;0
+6215;99;IRN+/IRN-;IJ CONGE POSTNATAL PAMC;0;9;99;0;;0;;0;Z;23;0;0;;;0
+6221;4;ARM;ALLOCATION REPOS MATERNEL NORMAL;0;0;13;0;;0;;0;Z;21;0;13;;;0
+6222;4;ARA;ALLOCATION REPOS MATERNEL ADOPTION;0;0;13;0;;0;;0;Z;21;0;13;;;0
+6231;99;IRM+/IRM-;IJ CONGE PRENATAL PAMC;0;9;99;0;;0;;0;Z;23;0;13;;;0
+6232;99;IRA-/IRA+;IJ CONGE ADOPTION PAMC;0;9;99;0;;0;;0;Z;23;0;13;;;0
+6233;99;IRP-;IJ CONGE PATHOLOGIQUE PAMC;0;9;99;0;;0;;0;Z;23;0;13;;;0
+6234;4;IRG;INDEMNITE MATERNITE EN CAS DE NAISSANCES MULTIPLES;0;0;13;0;;0;;0;Z;21;0;13;;;0
+6235;4;IRC;INDEMNITE DE REMPLACEMENT CONJOINTE COLLABORATRICE;0;0;13;0;;0;;0;Z;21;0;13;;;0
+6236;99;IPA-;IJ PATERNITE PAMC;0;9;99;0;;0;;0;Z;23;0;13;;;0
+6237;4;IPC;INDEMNITE PATERNITE CONJOINT PAMC;0;0;13;0;;0;;0;Z;21;0;13;;;0
+6238;4;IPI;INDEMNITES PATERNITE CONJOINT INFIRMIER;0;0;13;0;;0;;0;Z;21;0;13;;;0
+6239;99;IDA;INDEMNITÉ MALADIE DOUBLE ACTIVITÉ PAMC;1;9;99;0;;0;;0;Z;21;0;0;;;0
+6241;4;AVP;ALLOCATION ACCOMPAGNEMENT FIN DE VIE CESSATION ACTIVITE TEMPS PLEIN;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6242;4;AVR;ALLOCATION ACCOMPAGNEMENT FIN DE VIE CESSATION ACTIVITE REDUITE;0;0;13;0;;0;;0;Z;23;0;13;;;0
+6251;4;NNO-;ALLOCATION NUIT NORMALE - 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6252;4;NNO+;ALLOCATION NUIT NORMALE + 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6253;4;NME-;ALLOCATION NUIT MAJOREE 3 ENFANTS - 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6262;4;EME+;ALLOCATION EXPOSITION MAJOREE 3 ENFANTS + 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6263;4;EMN;ALLOCATION EXPOSITION ARRET + 6 MOIS;0;0;13;0;;0;;0;Z;23;0;13;;fermé le 30/12/2006;0
+6264;4;EEN;ALLOCATION EXPOSITION ARRET + 6 MOIS ET 3 ENFANTS;0;0;13;0;;0;;0;Z;23;0;13;;fermé le 30/12/2006;0
+6311;4;;RDS IJ MATERNITE;0;0;13;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0
+6312;4;;RDS ALLOC MATERNITE;0;0;13;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0
+7111;8;PI/RPI;PENSION INVALIDITE AVANTAGES DE BASE;0;0;15;0;;0;;0;Z;;0;0;;;0
+7112;8;FN/RFN;PENSION INVALIDITE ALLOCATIONS SUPPLEMENTAIRES;0;0;15;0;;0;;0;Z;;0;0;;;0
+7113;8;TP/RTP;PENSION INVALIDITE MAJORATIONS POUR ASSISTANCE D UNE TIERCE PERSONNE;0;0;15;0;;0;;0;Z;;0;0;;;0
+7119;8;;PENSIONS D INVALIDITE SERVIES PAR LE REGIME SPECIAL DE SECURITE SOCIALE DANS LES MINES;0;0;15;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0
+8111;10;RVI;RENTES DE VICTIME;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8112;10;MTP;MAJORATIONS POUR ASSISTANCE D UNE TIERCE PERSONNE;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8113;10;RCS;RENTES DE CONJOINT SURVIVANT;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8114;10;REV;RENTES DE REVERSION;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8115;10;ROR;RENTES D ORPHELIN;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8116;10;RAS;RENTES D ASCENDANT;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8117;10;FIA;MAJORATION FAUTE INEXCUSABLE RENTE ASCENDANT;0;0;17;1;25;0;;0;Z;18;0;0;;;0
+8118;10;FIR;MAJORATION FAUTE INEXCUSABLE RENTE VICTIME;0;0;17;1;25;0;;0;Z;18;0;0;;;0
+8119;10;FIC;MAJORATION FAUTE INEXCUSABLE RENTE CONJOINT;0;0;17;1;25;0;;0;Z;18;0;0;;;0
+8120;10;FIO;MAJORATION FAUTE INEXCUSABLE RENTE ORPHELIN;0;0;17;1;25;0;;0;Z;18;0;0;;;0
+8121;99;PTP;PRESTATION COMPLEMENTAIRE POUR RECOURS A TIERCE PERSONNE;0;9;99;0;;0;;0;Z;18;0;0;;;0
+8221;10;MCR;MAJORATIONS CRISTALLISEES;0;0;17;0;;0;;0;Z;18;0;0;;fermé le 31/12/2006;0
+8222;10;ROB;RACHAT OBLIGATOIRE;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8223;10;RFT;RACHAT FACULTATIF TOTAL;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8224;10;RFP;RACHAT FACULTATIF PARTIEL;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8225;10;;TRANSFERT DE CAPITAUX;0;0;17;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0
+8226;10;ICA;INDEMNITE EN CAPITAL ACCIDENT DU TRAVAIL;0;0;17;0;;0;;0;Z;18;0;0;;;0
+8227;10;FII;MAJORATION FAUTE INEXCUSABLE INDEMNITE EN CAPITAL;0;0;17;1;25;0;;0;Z;18;0;0;;;0
+9111;12;P01;COMPLEMENT TICKET MODERATEUR;0;0;19;0;;0;;1;Z;21;0;39;;;0
+9112;12;P02;COMPLEMENT FRAIS DE TRANSPORT ET DE SEJOUR;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9113;12;P03;FRAIS OCCASIONNE PAR LE DON D ORGANES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9114;12;P04;COMPLEMENT AUX FRAIS D HOSPITALISATION DE LA MERE QUI ALLAITE SON ENFANT;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9115;12;P05;PERTE DE SALAIRE POUR ENFANT MALADE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9116;12;P06;INDEMNITES JOURNALIERES MATERNITE POUR CERTAINES CATEGORIES D ASSURES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9118;12;P08;FRAIS FUNERAIRES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9119;12;P10;COMPLEMENT POUR CURE THERMALE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9121;12;P11;COMPLEMENT TICKET MODERATEUR POUR ENFANT DE MOINS D UN AN;0;0;19;0;;0;;1;Z;21;0;39;;;0
+9122;12;P12;ALLOCATION DECES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9123;12;P13;COMPLEMENT MALADIES CHRONIQUES ET MAINTIEN A DOMICILE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9129;12;P09;INDEMNITES COMPLEMENTAIRES EN REEDUCATION PROFESSIONNELLE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9131;12;PSD/IPS;INDEMNITE DE PERTE DE SALAIRE Y COMPRIS DIALYSE A DOMICILE;0;0;19;0;;0;;0;Z;;0;0;;;0
+9132;12;TTH;FRAIS DE DEPLACEMENT EN CURE THERMALE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9133;12;HTH;FRAIS D HEBERGEMENT EN CURE THERMALE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9134;12;PFR;PRIME DE FIN DE REEDUCATION PROFESSIONNELLE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9135;12;SEC;AIDE FINANCIERE EXCEPTIONNELLE (SECOURS);0;0;19;0;;0;;0;Z;;0;0;;fermé le 20/02/2004;0
+9141;12;VA;MAJORATION LIEE A UNE VISITE D URGENCE;0;0;19;0;;0;;1;Z;20;0;49;;fermé le 15/09/2014;0
+9142;12;KAU;MAJORATION LIEE A UN K D URGENCE;0;0;19;0;;0;;1;Z;;0;49;;;0
+9143;12;;SUPPLEMENT POUR SOINS AUX POLYTRAUMATISES;0;0;19;0;;0;;1;Z;;0;49;;;0
+9144;12;;HONORAIRES REMUNERANT LA PERMANENCE TELEPHONIQUE SUR LA BASE DE 3 C DE L HEURE (ASTREINTE);0;0;19;0;;0;;1;Z;;0;47;;Saisie manuelle Qualiflux;0
+9151;99;;PLAN SEGUIN;0;9;99;0;;0;;1;Z;;0;22;;fermé le 20/05/2011;0
+9152;99;;PLAN EVIN;0;9;99;0;;0;;1;Z;;0;22;;fermé le 07/07/2011;0
+9161;12;;PRESTATIONS SUPPLEMENTAIRES ALSACE-MOSELLE;0;0;19;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0
+9162;12;SNM;SURVEILLANCE MEDICALE MATERNELLE EN ACTION SANITAIRE ET SOCIALE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9163;12;;CENTRE EXAMEN SANTE SAISIE MANUELLE;0;0;19;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0
+9164;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9165;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9166;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9167;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9168;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9169;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9170;12;;PRESTATIONS D ASS SNCF ET REGIME GENERAL;0;0;19;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9191;12;;TICKET MODERATEUR - PREVENTION BUCCO-DENTAIRE;0;0;19;0;;0;;1;Z;;0;22;;;0
+9201;99;EEP;ENTRETIEN EVALUATION PSYCHOLOGUE;0;9;99;0;;0;;1;C;20;0;1;;;0
+9202;99;APS;ACCOMPAGNEMENT PSYCHOLOGIQUE DE SOUTIEN;0;9;99;0;;0;;1;C;20;0;1;;;0
+9203;99;PSS;PSYCHOTHERAPIE STRUCTUREE;0;9;99;0;;0;;1;C;20;0;1;;;0
+9211;12;VCC/VAC;VACCIN (MILITAIRES) / VACCIN GRIPPE (CRPCEN);0;1;19;0;;0;;1;Z;21;0;5;;;0
+9221;13;TDF;TEST DE DEPISTAGE RAPIDE (FOURNISSEUR);0;0;20;0;;0;;0;Z;21;0;0;;;0
+9311;13;BD4;DEPISTAGE DU CANCER COLO-RECTAL;0;0;20;0;;0;;1;Z;21;0;5;;;0
+9312;13;;DEPISTAGE DU CANCER DU SEIN (PREVENTION MALADIE);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0
+9313;13;BD3;FRAIS DE LECTURE POUR MAMMOGRAPHIE;0;0;20;0;;0;;0;Z;21;0;0;;;0
+9318;13;;ANALYSES DEPISTAGE CANCER UTERUS (PREVENTION MALADIE);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0
+9319;13;;PRELEVEMENTS DEPISTAGE CANCER UTERUS (PREVENTION MALADIE);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0
+9411;13;;CONSULTATION HYGIENE BUCCO-DENTAIRE;0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0
+9412;13;;HYGIENE BUCCO-DENTAIRE N91 (SCELLEMENT D UNE MOLAIRE);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0
+6254;4;NME+;ALLOCATION NUIT MAJOREE 3 ENFANTS + 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6255;4;NMN;ALLOCATION NUIT MAJOREE ARRET + 6 MOIS;0;0;13;0;;0;;0;Z;23;0;13;;fermé le 30/12/2006;0
+6256;4;NEN;ALLOCATION NUIT MAJOREE ARRET + 6 MOIS ET 3 ENFANTS;0;0;13;0;;0;;0;Z;23;0;13;;fermé le 30/12/2006;0
+6257;4;ENO-;ALLOCATION EXPOSITION NORMALE - 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6258;4;ENO+;ALLOCATION EXPOSITION NORMALE + 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+6261;4;EME-;ALLOCATION EXPOSITION MAJOREE 3 ENFANTS - 3 MOIS;0;0;13;0;;0;;0;Z;;0;13;;;0
+9704;12;ASP;COMPLEMENT D ACTION SOCIALE PROTHESES DENTAIRES (CLERCS ET EMPLOYES DE NOTAIRES, PORT AUTONOME DE BORDEAUX);0;0;19;0;;0;;1;Z;20;0;38;;;0
+9705;12;SSP;COMPLEMENT D ACTION SOCIALE DENTAIRE ALSACE MOSELLE (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;21;0;0;;;0
+9706;12;SOL;COMPLEMENT D ACTION SOCIALE OPTIQUE (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;1;Z;21;0;19;;;0
+9707;12;SOE;COMPLEMENT D ACTION SOCIALE OPTIQUE, ENFANT DE MOINS DE 16 ANS (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;1;Z;21;0;19;;;0
+9708;12;SOM;COMPLEMENT D ACTION SOCIALE MONTURES (CLERCS ET EMPLOYES DE NOTAIRES, PORT AUTONOME DE BORDEAUX);0;0;19;0;;0;;1;Z;21;0;19;;;0
+9709;12;SOV;COMPLEMENT D ACTION SOCIALE VERRES (CLERCS ET EMPLOYES DE NOTAIRES, PORT AUTONOME DE BORDEAUX);0;0;19;0;;0;;1;Z;21;0;19;;;0
+9710;12;;COMPLEMENT D ACTION SOCIALE LENTILLES (CLERCS ET EMPLOYES DE NOTAIRES, PORT AUTONOME DE BORDEAUX);0;0;19;0;;0;;1;Z;;0;19;;Autre regime NEC;0
+9711;12;LYA/LYB/LYJ/LYT;PRIME DE LAYETTE (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;;0;0;;;0
+9712;12;VOY;ALLOCATION VOYAGE DES ENFANTS A LA COLONIE DU PRARIAND (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;21;0;0;;fermé le 31/12/2008;0
+9713;12;COL;AIDE AUX SEJOURS DES ENFANTS (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;21;0;0;;;0
+9714;12;FAM;ALLOCATION VACANCES (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;21;0;0;;;0
+9715;12;SSO;AIDES FINANCIERES INDIVIDUELLES OPTIQUE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9716;12;SSD;ACTION SOCIALE DENTAIRE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9717;12;SSA;ACTION SOCIALE AUDITIF;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9718;12;CCT;ACTION SOCIALE CURE THERMALE;0;0;19;0;;0;;0;Z;21;0;0;;fermé le 01/01/2000;0
+9719;12;SEJ;AIDES FINANCIERES INDIVIDUELLES HOSPITALISATION;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9721;12;DIF;AIDES FINANCIERES INDIVIDUELLES DIFFICULTES FINANCIERES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9722;12;AAF;AIDES FINANCIERES INDIVIDUELLES AUTRES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9723;12;RET;VACANCES RETRAITES (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;;0;0;;fermé le 31/12/2008;0
+9724;12;SCO;AIDE A LA SCOLARITE (CRPCEN);0;0;19;0;;0;;0;Z;21;0;0;;;0
+9725;12;CES;ACCUEIL JEUNE ENFANT (CRPCEN);0;0;19;0;;0;;0;Z;21;0;0;;;0
+9726;99;SOP;PRESTATIONS SUPPLÉMENTAIRES OPTIQUES CRPCEN;0;9;99;0;;0;;0;Z;21;0;0;;;0
+9727;99;SAU;PRESTATIONS SUPPLÉMENTAIRES ACOUSTIQUES CRPCEN;0;9;99;0;;0;;0;Z;21;0;0;;;0
+9731;12;AMU;COMPLEMENT A L'ACS;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9732;12;AFI;AIDE A L'ACQUISITION D'UNE COUVERTURE COMPLEMENTAIRE POUR LES VICTIMES DU SEUIL ACS;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9741;12;HLO;AIDES AU LOGEMENT;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9742;12;HCO;AIDE A LA COMMUNICATION HORS AUDITIF;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9840;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9841;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9842;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9843;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9844;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9845;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9846;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9847;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9848;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9849;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9850;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9851;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9852;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9853;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9854;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9855;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9856;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9857;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9858;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9859;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9860;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9861;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9862;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9863;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9864;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9865;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9866;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9867;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9868;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9869;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9870;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9871;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9872;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9873;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9874;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9769;12;DIN;AIDES FINANCIERES A CARACTERE SOCIAL PALLIANT L'ABSENCE DE REVENUS;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9771;12;HPA;AIDES PROTHESES AUDITIVES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9772;12;HAI;AIDES MENAGERES;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9773;12;ASH;AIDES MENAGERES SORTIE D'HOSPITALISATION COORDONNEE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9774;12;PSH;ACTES FOURNITURES SORTIE D'HOSPITALISATION COORDONNEE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9775;12;PAD;AIDES MENAGERES PROGRAMME D'ACCOMPAGNEMENT APRES INTERVENTION ORTHOPEDIQUE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9776;12;PFD;ACTES FOURNITURES PROGRAMME D'ACCOMPAGNEMENT APRES INTERVENTION ORTHOPEDIQUE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9777;12;PPL;PHARMACIE NON REMBOURSABLE SOINS PALLIATIFS;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9778;12;BCP;BILANS DE COMPETENCES ET REORIENTATION PROFESSIONNELLE;0;0;19;0;;0;;0;Z;21;0;0;;;0
+9801;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9802;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9803;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+9804;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0
+1407;99;CT1;COURONNE TRANSITOIRE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0
+1408;99;CZ1;COURONNE ZIRCONE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1409;99;IC1;INLAY CORE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1410;99;IN1;INLAY ONLAY RAC MOERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1412;99;PA1;PROTHESE AMOVIBLE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1413;99;RE1;REPARATION PROTHESE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1414;99;SU1;SUPPLEMENT PROTHESE METALLIQUE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +9875;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9876;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9877;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9878;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9879;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9880;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9881;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9882;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9883;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9884;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9885;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9886;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9887;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9888;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9889;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9890;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9891;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9892;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 
+9893;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9894;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9895;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9896;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9897;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9898;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9899;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9901;12;TRA;TRANSPORT POUR PERSONNE ACCOMPAGNANTE (MILITAIRES);0;0;19;0;;0;;0;Z;21;0;0;;;0 +9902;12;AFA;AIDE SOCIALE (MILITAIRES);0;0;19;0;;0;;0;Z;21;0;0;;;0 +9911;12;AIM;AIDE MENAGERE;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9912;12;AFM;AIDE MENAGERE FAMILIALE (MILITAIRES);0;0;19;0;;0;;0;Z;21;0;0;;;0 +9999;99;;VALEUR INCONNUE;9;9;99;0;;0;;0;I;;0;0;;;1 +1195;99;A51;FOND INNOVATION - PAIEMENT;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1404;99;RA0;REPARATION ADJONCTION RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1405;99;RS0;REPARATION PROTHESE ADJOINTE SIMPLE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1406;99;SU0;SUPPLEMENT PROTHESE RESINE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +2386;5;PY6;FORFAIT PSYCHIATRIE SEANCE COLL, 2 INTERVENANT 6 à 8H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2387;5;PY7;FORFAIT PSYCHIATRIE SEANCE IND. 
2 INTERVENANTS 6 à 8H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2388;5;PY8;FORFAIT PSYCHIATRIE DE SECURITE HOSPITALISATION SANS HEBERGEMENT;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2389;5;PY9;PRISE EN CHARGE DE NUIT POUR UNE DUREE ENTRE 8 ET 12H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2391;5;FR2;FORFAIT TECHNIQUE TARIF REDUIT N°2;1;1;10;0;;0;1;0;Z;;0;0;;;0 +2392;5;FR3;FORFAIT TECHNIQUE TARIF REDUIT N°3;1;1;10;0;;0;1;0;Z;;0;0;;;0 +2411;11;IG;INTERVENTION IVG;1;0;10;0;;0;;1;T;20;0;18;1;;0 +2412;11;IGA;ANESTHESIE GENERALE;1;0;10;0;;0;;1;T;20;0;18;1;;0 +2413;11;IGB;INVESTIGATIONS BIOLOGIQUES;1;0;10;0;;0;;1;Z;20;0;18;;;0 +2414;11;IC;CONSULTATION IVG;1;0;10;0;;0;;1;C;20;0;18;1;;0 +2415;11;IGM;MEDICAMENTS: MIFEYGINE;1;0;10;0;;0;;1;Z;21;0;5;;;0 +2416;11;IGP;MEDICAMENTS: PROSTAGLANDINES;1;0;10;0;;0;;1;Z;21;0;5;;;0 +2417;11;IVB;VERIFICATION BIOLOGIQUE;1;0;10;0;;0;;1;Z;20;0;18;;;0 +2418;11;IVE;VERIFICATION ECHOGRAPHIQUE;1;0;10;0;;0;;1;Z;20;0;18;1;;0 +2419;11;IMD;FORFAIT INTERVENTION AMBULATOIRE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2420;11;IMI;FORFAIT INTERVENTION DUREE < OU = 12 H PRIVE MEDIC;0;0;10;0;;0;;1;Z;22;0;18;;;0 +2421;11;AMD;INTERVENTION + ANESTHESIE AMBULATOIRE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2422;11;AMF;FORFAIT POUR IVG MEDICAMENTEUSE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2423;11;IPD;FORFAIT INTERVENTION AVEC NUITEE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2424;11;APD;INTERVENTION + ANESTHESIE AVEC NUITEE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2425;99;FJS;FORFAIT IVG POUR 24H SUPPLEMENTAIRES - SECTEUR PRIVE/SECTEUR PUBLIC;0;9;99;0;;0;;0;Z;22;0;0;;fermé le 30/03/2013;0 +2426;11;ICS;CONSULTATION IVG SPECIALISTE;1;0;10;0;;0;;1;C;20;0;18;;;0 +2428;99;IPE;ECHO PRE IVG;1;9;10;0;;0;;1;Z;20;0;18;;;0 +2501;5;FI1;FORFAIT FIR PDSES PUBLIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2502;5;FI2;FORFAIT FIR CENTRE DEPISTAGE ANONYME ET GRATUIT;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2503;5;FI3;FORFAIT FIR CENTRE PERINATAUX DE PROXIMITE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2504;5;FI4;FORFAIT FIR EDUCATION THERAPEUTIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +3101;99;FSA;FORFAIT STRUCTURE PS 
AUXILIAIRES;1;9;99;0;;0;;1;Z;21;0;47;;;0 +3102;99;FSB;CONTESTATION FORFAIT STRUCTURE PS AUXILIAIRES;1;9;99;0;;0;;1;Z;21;0;47;;;0 +3110;99;CII;CONTRAT INCITATIF INFIRMIER;1;9;99;0;;0;;1;Z;20;0;47;;;0 +3111;2;AMI;ACTES EN AMI;1;1;;0;;1;1;1;Z;20;1;32;;;0 +3112;2;AIS;ACTES INFIRMIERS DE SOINS (AMI3-AMI13,AMI16);1;0;3;0;;1;1;1;Z;20;1;3;;;0 +3113;2;SFI;ACTES INFIRMIERS DES SAGES-FEMMES (SFI);1;1;;0;;1;1;1;Z;20;1;32;;;0 +3114;2;PSI;PLAN DE SOINS INFIRMIER;1;0;3;0;;0;;0;Z;;1;0;;fermé le 01/10/1998;0 +3115;2;DI;DEMARCHE INFIRMIER;1;0;3;0;;1;;1;Z;20;1;3;;;0 +3116;2;MAU;MAJORATION POUR ACTE UNIQUE;1;0;3;1;32;0;1;1;Z;20;0;70;;;0 +3117;2;MCI;MAJORATION DE COORDINATION ET D'ENVIRONNEMENT DE SOIN INFIRMIER;1;0;3;1;32;0;1;1;Z;20;0;70;;;0 +3118;2;VGI;REMUNERATION VACCINATION GRIPPE A PAR INFIRMIERS LIBERAUX;1;0;3;0;;0;;1;Z;20;0;47;;;0 +3119;2;VIR;REMUNERATION VACCINATION GRIPPE A PAR INFIRMIER RETRAITE OU SALARIE HORS OBLIGATIONS;1;0;3;0;;0;;0;Z;;0;0;;;0 +3121;2;AMC;ACTES AMC;1;0;3;0;;1;1;1;Z;20;1;3;;;0 +3122;2;AMK;ACTES EN AMK;1;0;3;0;;1;1;1;Z;20;1;3;;;0 +3124;2;AMB;BILAN DE KINESITHERAPIE;1;0;3;0;;0;;0;Z;;1;0;;fermé le 01/10/1998;0 +3125;2;AMS;ACTES DE KINESITHERAPIE OSTEO-ARTICULAIRE;1;0;3;0;;1;1;1;Z;20;1;3;;;0 +3126;99;CIK;CONTRAT DEMOGRAPHIQUE KINE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +3127;99;FRD;FORFAIT PRISE EN CHARGE AVC;1;9;99;0;;1;;1;C;20;0;46;1;;0 +3128;99;FAD;RETOUR A DOMICILE POST CHIRURGIE ORTHOPEDIQUE;1;9;99;0;;1;;1;C;20;0;46;1;;0 +3131;2;AMP;ACTES DES PEDICURES;1;0;3;0;;1;1;1;Z;20;1;3;;;0 +3132;2;AMO;ACTES DES ORTHOPHONISTES;1;0;3;0;;1;;1;Z;20;1;3;;;0 +3133;2;AMY;ACTES DES ORTHOPTISTES;1;0;3;0;;1;;1;Z;20;1;3;;;0 +3134;1;POD;ACTE DE PEDICURE-PODOLOGUE (DIABETIQUE);1;0;3;0;;1;1;1;Z;20;0;3;;;0 +3135;99;FOT;FORFAIT EVALUATION DOMICILE HANDICAP;1;9;99;1;32;0;;1;Z;20;0;70;;;0 +3139;99;CIO;CONTRAT INCITATIF ORTHOPHONISTE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +3211;2;B;ACTES DE BIOLOGIE;1;1;;0;;1;1;1;Z;20;0;29;;;0 +3212;2;BP;ACTES D ANATOMO-CYTO-PATHOLOGIE EN 
LABORATOIRE;1;0;4;0;;0;;1;Z;;0;29;;fermé le 31/12/2001;0 +3213;99;FPB;FORFAIT PREALABLE BIOLOGIE IVG VILLE;1;9;99;0;;0;;1;Z;20;0;18;0;;0 +3214;2;BR;ACTES EN BR;1;0;4;0;;0;;1;Z;;0;29;;fermé le 31/12/2001;0 +3215;2;ADU;ANALYSE DEPISTAGE CANCER DE L'UTERUS;1;0;4;0;;0;;1;Z;;0;4;;fermé le 01/01/1999;0 +3216;99;FUB;FORFAIT ULTERIEUR BIOLOGIE IVG VILLE;1;9;99;0;;0;;1;Z;20;0;18;0;;0 +3221;2;KB;PRELEVEMENT AUTRE QUE SANGUIN PAR UN DIRECTEUR DE LABORATOIRE;1;0;4;0;;1;;1;Z;20;0;4;;;0 +3222;2;PB;PRELEVEMENT SANGUIN PAR UN DIRECTEUR DE LABORATOIRE;1;1;;0;;1;1;1;Z;20;0;4;;;0 +3223;2;TB;PRELEVEMENT SANGUIN PAR UN TECHNICIEN DE LABORATOIRE;1;1;;0;;1;1;1;Z;20;0;4;;;0 +3224;2;KDU;PRELEVEMENT DEPISTAGE CANCER DE L'UTERUS;1;0;4;0;;0;;1;Z;;0;20;;fermé le 01/01/1999;0 +3225;2;KMB;PRELEVEMENT PAR PONCTION VEINEUSE DIRECTE POUR UN MEDECIN BIOLOGISTE;1;0;4;0;;1;1;1;T;20;0;4;;;0 +3300;99;RRA;REMUNERATION ROSP AOD;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3301;99;CRA;CONTESTATION ROSP AOD;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3302;99;PGC;PAIEMENT GARANTIE CONVENTIONNELLE PHARMACIE;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3303;99;CGC;CONTESTATION GARANTIE CONVENTIONNELLE PHAMACIE;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3304;99;PQS;PAIEMENT QUALITE DE SERVICE PHARMACIE;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3305;99;CQS;CONTESTATION QUALITE DE SERVICE PHARMACIE;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3306;99;BMR;BILAN MEDICATION REMUNERATION PHARMACIE;1;9;99;0;;0;;0;Z;21;0;12;;;0 +3307;99;BMC;BILAN MEDICATION CONTESTATION PHARMACIE;1;9;99;0;;0;;0;Z;21;0;12;;;0 +3311;2;PH1;PHARMACIE 100%;1;0;5;0;;1;;1;Z;21;0;28;;;0 +3312;2;PH4/PG4;PHARMACIE PH4;1;0;5;0;;1;;1;Z;;0;28;;;0 +3313;2;PH7/PG7;PHARMACIE 65%;1;1;;0;;1;;1;Z;;0;28;;PG7 : fermé le 01/10/2007;0 +2369;5;N15;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 15;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2371;5;G1;TARIF SOINS GIR 1 ET 2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2372;5;G2;TARIF SOINS GIR 3 ET 4;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2373;5;G3;TARIF SOINS GIR 5 ET 6;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2380;5;PY0;FORFAIT PSYCHIATRIE SEANCE COLL, 1 
INTERVENANT 3 à 4H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2381;5;PY1;FORFAIT PSYCHIATRIE SEANCE IND, 1 INTERVENANT 3 à 4H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2382;5;PY2;FORFAIT PSYCHIATRIE SEANCE COLL, 2 INTERVENANTS 3 à 4H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2383;5;PY3;FORFAIT PSYCHIATRIE SEANCE IND. 2 INTERVENANTS 3 à 4H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2384;5;PY4;FORFAIT PSYCHIATRIE SEANCE COLL, 1 INTERVENANT 6 à 8H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2385;5;PY5;FORFAIT PSYCHIATRIE SEANCE IND. 1 INTERVENANT 6 à 8H;0;0;10;0;;0;;0;Z;22;0;0;;;0 +1146;99;R4P;REMUNERATION OBJECTIF MEDECIN COMPLEMENT ET CENTRES DE SANTE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1147;99;CFM;CONTESTATION FORFAIT MEDECIN TRAITANT;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1148;99;MPA;REMUNERATION FORFAITAIRE PAR CONSULTATION POUR LE SUIVI DES PERSONNES AGEES;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1149;99;CPA;CONTESTATION REMUNERATION SUIVI PERSONNES AGEES;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1150;99;RPT;REMUNERATION PRATICIENS TERRITORIAUX DE MEDECINE GENERALE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1152;99;MPP;MAJORATION SUIVI DES ENFANTS GRANDS PREMATURES OU ATTEINTS DE PATHOLOGIE CONGENITALE GRAVE;1;9;99;1;22;0;;1;C;20;0;1;1;fermé le 31/12/2017;0 +1153;99;MIC;MAJORATION CONSULTATION POUR INSUFFISANT CARDIAQUE APRES HOSPITALISATION;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1154;99;MSH;MAJORATION CONSULTATION SUIVI APRES HOSPITALISATION PATIENTS A FORTE COMORBIDITE;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1155;99;COT;REMU CAS COTISATIONS SOCIALES;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1156;99;CCO;CONT. 
CAS COT.SOC.;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1157;99;TCP;Acte de téléconsultation;1;9;99;0;;1;;1;Z;20;0;1;;;0 +1158;99;TEP;Acte de télé expertise;1;9;99;0;;1;;1;Z;20;0;1;;;0 +1159;99;RNO;renouvellement d'optique;1;9;99;0;;1;;1;C;20;0;1;0;;0 +1160;99;IAS;INVESTISSEMENT ACTIVITE SAISONNIERE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1161;99;CPS;COMPLEMENT PRATIQUE SAISONNIERE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1162;99;RCM;REMUNERATION DES PRATICIENS TERRITORIAUX DE MEDECINE AMBULATOIRE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1163;99;KIT;REMUNERATION DEPISTAGE DU CANCER COLORECTAL;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1164;99;TLC;TÉLÉ CONSULTATION - ALD ET / OU EHPAD;1;9;99;0;;1;;1;C;20;0;1;;;0 +1165;99;TLE;TÉLÉ EXPERTISE - ALD ET/OU EHPAD;1;9;99;0;;0;;1;Z;20;0;1;;;0 +1166;99;TEC;FORFAIT COMPLÉMENTAIRE TÉLÉ EXPERTISE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1167;99;NRD;VERSEMENT DE PENALITE DE RETARD AMO;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1168;99;CCP;CONSULTATION DE CONTRACEPTION ET PREVENTION;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1169;99;RCD;REMUNERATION POUR CERTIFICAT DE DECES;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1170;99;DHE;PEC EXCEPTIONNELLE DÉPASSEMENT HONORAIRE;1;9;99;0;;0;;1;Z;20;0;1;;;0 +1171;99;CSR;RÉMUNÉRATION MÉDECIN TRAITANT CENTRES DE SANTÉ;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1172;99;TSA;TELESURVEILLANCE : PS EFFECTUANT L'ACCOMPAGNEMENT;1;9;99;0;;0;;1;C;20;0;1;;;0 +1173;99;MTF;FORFAIT PATIENTELE MEDECIN TRAITANT;1;9;99;0;;0;;1;Z;21;0;47;;;0 +0;0;;SANS OBJET;0;0;;0;;0;;0;Z;;0;0;;;2 +1095;99;MTJ;MAJORATION SPECIFIQUE MOINS DE 16 ANS MAYOTTE;1;9;99;1;23;0;;1;Z;20;0;1;1;;0 +1096;99;TTE;TELECONSULTATION MEDECIN TRAITANT AVEC EHPAD;1;9;99;0;;0;;1;C;20;0;1;;;0 +1097;99;TDT;TELE EXPERTISE DOSSIER TRAITANT;1;9;99;0;;0;;1;C;20;0;1;;;0 +1098;99;U03;CONSULTATION CCMU 3;1;9;99;0;;0;;1;C;20;0;1;1;;0 +1099;99;U45;CONSULTATION CCMU 4 ET 5;1;9;99;0;;0;;1;C;20;0;1;1;;0 +1100;99;RNM;PROTOCOLE MURAINE - BILAN VISUEL;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1101;99;APU;AVIS PONCTUEL DE CONSULTANT PUPH;1;9;99;0;;1;1;1;C;20;0;1;1;;0 +1102;99;APY;AVIS PONCTUEL DE 
CONSULTANT PSYCHIATRE;1;9;99;0;;1;1;1;C;20;0;1;1;;0 +1103;99;APC;AVIS PONCTUEL DE CONSULTANT DU MEDECIN;1;9;99;0;;0;1;1;C;20;0;1;1;;0 +1104;99;COE;CONSULTATION OBLIGATOIRE ENFANT;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1105;99;CCX;CONSULTATION COMPLEXE;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1106;99;MCX;MAJORATION CONSULTATION COMPLEXE;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1107;99;CCE;CONSULTATION TRES COMPLEXE ENFANT;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1108;99;MTX;MAJORATION CONSULTATION TRES COMPLEXE;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1109;99;GS;CONSULTATION SPECIALISTE MEDECINE GENERALE;1;9;99;0;;1;1;1;C;20;0;1;1;;0 +1110;99;G;CONSULTATION MEDECINE GENERALE;1;9;99;0;;1;1;1;C;20;0;1;1;;0 +1111;1;C;CONSULTATION COTEE C;1;1;;0;;1;1;1;C;20;0;1;1;;0 +1112;1;CS;CONSULTATION COTEE CS;1;1;;0;;1;1;1;C;20;0;1;1;;0 +1113;1;CNP;CONSULTATION COTEE CNP;1;1;;0;;1;1;1;C;20;0;1;1;;0 +1114;1;CSC;CONSULTATION SPECIFIQUE CARDIOLOGIE;1;0;1;0;;1;1;1;C;20;0;1;1;;0 +1115;1;CA;CONSULTATION BILAN;1;0;1;0;;1;;1;C;20;0;1;1;fermé le 01/07/2017;0 +1116;1;MPC;MAJORATION FORFAITAIRE TRANSITOIRE;1;1;;1;6;0;1;1;Z;20;0;1;1;;0 +1117;1;;CONSULTATION DES SPECIALISTES COTEE C2;1;1;;0;;1;1;1;C;20;0;1;1;;0 +1118;1;;CONSULTATION DES PSYCHIATRES COTEE C2,5;1;1;;0;;1;1;1;C;20;0;1;1;;0 +1119;1;MTS;MAJORATION TRANSITOIRE SPECIFIQUE;1;0;1;1;23;0;;1;Z;20;0;1;1;;0 +1120;1;CCS;COMPLEMENT CONSULTATION SPECIALISTE;1;0;1;1;26;0;1;1;C;20;0;1;;;0 +1121;1;HS;HONORAIRE DE SURVEILLANCE;1;0;1;0;;0;;1;C;20;0;1;;;0 +1122;1;EXS;EXAMEN SPECIAL (PROTOCOLE);1;0;1;0;;1;;1;C;20;0;16;1;;0 +1123;1;SES;SUITE D EXAMEN DE SANTE;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1124;1;RMT;REMUNERATION MEDECIN TRAITANT PAR PATIENT EN ALD;1;0;1;0;;0;;1;Z;20;0;47;;fermé le 31/12/2017;0 +1125;1;MCG;MAJORATION DE COORDINATION DES GENERALISTES;1;1;;1;18;0;1;1;Z;20;0;1;1;;0 +1126;1;MCS;MAJORATION DE COORDINATION SPECIALISTES;1;1;;1;18;0;1;1;Z;20;0;1;1;;0 +1127;1;MCC;MAJORATION DE COORDINATION CARDIOLOGUES;1;1;;1;18;0;;1;Z;20;0;1;1;;0 +1128;1;DRT;DIFFERENTIEL REFERENT 
TRAITANT;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1129;1;MPJ;MAJORATION FORFAITAIRE TRANSITOIRE (POUR LES MOINS DE 16 ANS);1;1;;1;6;0;1;1;Z;20;0;1;1;fermé le 31/12/2017;0 +1130;99;FMT;FORFAIT MEDECIN TRAITANT;1;9;99;0;;0;;1;Z;20;0;47;;fermé le 31/12/2017;0 +1131;1;MTA;MAJORATION CONSULTATION APPAREILLAGE;1;0;1;1;22;0;;1;C;20;0;1;1;;0 +1132;1;MCE;MAJORATION CONSULTATION ENDOCRINO;1;0;1;1;22;0;;1;C;20;0;1;1;;0 +1133;1;MGE;MAJORATION GENERALISTE ENFANT;1;0;1;1;5;0;;1;C;20;0;1;1;fermé le 31/12/2017;0 +1134;1;MPF;MAJORATION PREMIERE CONSULTATION FAMILLE;1;0;1;1;22;0;;1;C;20;0;1;1;;0 +1135;1;MAF;MAJORATION CONSULTATION ANNUELLE FAMILLE;1;0;1;1;22;0;;1;C;20;0;1;1;;0 +1136;1;MAS;MAJORATION ANNUELLE DE SYNTHESE;1;0;1;1;5;0;;1;C;20;0;1;1;fermé le 31/12/2017;0 +1137;1;MBB;MAJORATION NOURRISSON;1;0;1;1;5;0;;1;C;20;0;1;1;fermé le 31/12/2017;0 +1138;1;RAA;REMUNERATION ADDITIONNELLE CAPI;1;0;1;1;28;0;;1;Z;20;0;47;;fermé le 30/06/2017;0 +1139;1;RAC;REMUNERATION DES ADHERENTS AU CAPI;1;0;1;0;;0;;1;Z;20;0;47;;fermé le 30/06/2012;0 +1140;1;CDE;CONSULTATION SPECIFIQUE DE DEPISTAGE;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1141;1;MPE;MAJORATION PEDIATRE ENFANT;1;0;1;1;5;0;;1;C;20;0;1;1;fermé le 31/12/2017;0 +1142;99;RSO;REMUNERATION ADHESION SOPHIA (SOINS DE VILLE);1;9;99;0;;0;;1;Z;20;0;47;;;0 +1143;99;RSR;REMUNERATION RENOUVELLEMENT SOPHIA (SOINS DE VILLE);1;9;99;0;;0;;1;Z;20;0;47;;;0 +1144;99;RST;REMUNERATION FORFAITAIRE POUR LE SUIVI DES PATIENTS EN POST ALD;1;9;99;0;;0;;1;Z;20;0;47;;fermé le 31/12/2017;0 +1145;99;P4P;REMUNERATION OBJECTIF MEDECIN ET CENTRES DE SANTE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1174;99;TSM;TELESURVEILLANCE : MEDECIN TELESURVEILLANT;1;9;99;0;;0;;1;C;20;0;1;;;0 +1175;99;DHT;PEC EXCEPTIONNELLE DEPASSEMENT HONORAIRE TP;1;9;99;0;;0;;1;Z;20;0;1;;;0 +1176;99;COI;CONTRAT INDIVIDUEL EMBAUCHE - AVANCE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1177;99;AFC;AIDE FINANCIERE MATERNITE PATERNITE ADOPTION;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1178;99;P6P;FORFAIT STRUCT. 
MEDECIN;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1179;99;R6P;FORFAIT STRUCT. MEDECIN COMPLEMENT;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1180;99;P5P;ROSP MT ENFANT;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1181;99;R5P;ROSP MT ENFANT COMPLEMENT;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1182;99;PTM;REMUNERATION OPTAM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1183;99;CTM;CONTESTATION OPTAM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1184;99;COS;CONTRAT INDIVIDUEL EMBAUCHE - SOLDE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1185;99;RNB;MURAINE - PAIEMENT DU BONUS ANNUEL;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1186;99;COC;CONTRAT INDIVIDUEL EMBAUCHE OBJECTIFS COMPLEMENTAIRES;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1187;99;CCI;CONTRAT COLLECTIF AIDE INVESTISSEMENT;1;9;99;0;;0;;0;Z;20;0;47;;;0 +1188;99;CCA;CONTRAT COLLECTIF ATTEINTE DES OBJECTIFS;1;9;99;0;;0;;0;Z;20;0;47;;;0 +1189;99;COF;CONTRAT DE FORMATION;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1190;99;TSP;TELESURVEILLANCE : PRIME VARIABLE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1191;99;TC;TELECONSULTATION TOUTES SPECIALITES;1;9;99;0;;1;1;1;C;20;0;1;;;0 +1192;99;TCG;TELECONSULTATION GENERALISTE;1;9;99;0;;1;1;1;C;20;0;1;;;0 +1193;99;TE1;TELE EXPERTISE DE NIVEAU 1;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1194;99;TE2;TELE EXPERTISE DE NIVEAU 2;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1209;99;VGS;VISITE SPECIALISTE MEDECINE GENERALE;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1210;99;VG;VISITE MEDECINE GENERALE;1;9;99;0;;1;;1;C;20;0;1;1;;0 +1211;1;V;VISITE COTEE V;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1212;1;VS;VISITE COTEE VS;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1213;1;VNP;VISITE COTEE VNP;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1214;1;VL;VISITE LONGUE ET COMPLEXE;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1215;99;APV;AVIS PONCTUEL DE CONSULTANT MEDECIN (VISITE);1;9;99;0;;1;;1;C;20;0;1;1;;0 +1216;99;AVY;AVIS PONCTUEL DE CONSULTANT PSYCHIATRE (VISITE);1;9;99;0;;1;;1;C;20;0;1;1;;0 +1221;1;VA;VISITE D URGENCE;1;0;1;0;;1;;1;C;20;0;1;1;fermé le 15/09/2014;0 +1222;1;VU;VISITE URGENCE VU/MU;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1223;1;MMD;MAJORATION MAINTIEN A DOMICILE;1;0;1;1;4;0;;1;Z;;0;1;;fermé le 30/09/2002;0 +1224;1;MD;MD (CRITERES 
MEDICAUX);1;0;1;1;4;0;1;1;Z;20;0;1;1;;0 +1225;1;MDE;MDE ( CRITERES ENVIRONNEMENTAUX);1;0;1;1;4;0;;1;Z;20;0;1;1;fermé le 10/02/2007;0 +1226;1;MDN;MD DE NUIT;1;0;1;1;1;0;1;1;Z;20;0;1;1;;0 +1227;1;MDI;MD DE MILIEU NUIT;1;0;1;1;1;0;1;1;Z;20;0;1;1;;0 +1228;1;MDD;MD DE DIMANCHE ET JOUR FERIES;1;0;1;1;2;0;1;1;Z;20;0;1;1;;0 +1229;1;MEN;MDE DE NUIT;1;0;1;1;1;0;;1;Z;;0;1;;fermé le 31/03/2003;0 +1231;1;MEI;MDE MILIEU DE NUIT;1;0;1;1;1;0;;1;Z;;0;1;;fermé le 31/03/2003;0 +1232;1;MED;MDE DIMANCHE ET JOUR FERIES;1;0;1;1;2;0;;1;Z;;0;1;;fermé le 31/03/2003;0 +1311;1;KC;ACTES EN K CHIRURGICAL;1;0;1;0;;1;1;1;T;20;0;37;1;fermé le 01/12/2014;0 +1312;1;K;ACTES DE SPECIALITE EN K;1;1;;0;;1;1;1;T;20;0;1;1;;0 +1313;1;KA;ACTES EN K D URGENCE;1;0;1;0;;1;;1;T;20;0;1;1;fermé le 01/01/2011;0 +1314;1;KFA;FORFAIT CHIRURGIE 1;1;0;1;1;8;0;;1;Z;20;0;1;;fermé le 31/12/2006;0 +1315;1;KFB;FORFAIT CHIRURGIE 2;1;0;1;1;8;0;;1;Z;20;0;1;;fermé le 31/12/2006;0 +1316;1;KE;ACTES DE DIAGNOSTIC COTES KE;1;1;;0;;1;1;1;T;20;0;1;1;;0 +1317;1;KCC;ACTES EN KCC: ACTES SPECIFIQUES DES CHIRURGIENS;1;0;1;0;;0;;1;T;20;0;1;1;fermé le 31/12/2006;0 +1318;1;KMO;ACTE DE PHONIATRIE PAR MEDECIN;1;0;1;0;;1;;1;T;20;0;1;1;;0 +1319;1;KFC;MAJORATION FORFAIT ACCOUCHEMENT;1;0;1;1;7;0;;1;Z;20;0;1;;fermé le 31/12/2006;0 +1320;1;KFD;FORFAIT RADIOGRAPHIE,ECHOGRAPHIE;1;0;1;1;15;0;;1;T;20;0;1;;fermé le 31/12/2006;0 +1321;99;ADC;ACTE DE CHIRURGIE CCAM;1;9;99;0;;1;1;1;T;20;1;44;1;;0 +1322;1;ACO;ACTE D'OBSTETRIQUE CCAM;1;1;;0;;1;1;1;T;20;0;25;1;;0 +1323;1;ADA;ACTE D'ANESTHESIE CCAM;1;1;;0;;1;1;1;T;20;0;25;1;;0 +1324;1;ADE;ACTE D'ECHOGRAPHIE CCAM;1;1;;0;;1;1;1;T;20;0;25;1;;0 +1331;1;Z;ACTES DE RADIOLOGIE;1;1;;0;;1;1;1;T;20;0;1;1;;0 +1332;1;ZN;ACTES DE RADIOLOGIE NUCLEAIRE;1;0;1;0;;0;;1;T;20;0;1;;fermé le 31/12/2006;0 +1333;1;PRA;MAJORATION POUR PRODUIT RADIOPHARMACEUTIQUE;1;0;1;1;9;0;;1;T;20;0;1;;fermé le 31/12/2006;0 +1334;1;DCS;DEPISTAGE CANCER DU SEIN;1;0;1;0;;0;;1;Z;;0;1;;fermé le 01/01/1999;0 +1335;1;ZM/ADI;ACTE DE RADIOLOGIE 
MAMMOGRAPHIE;1;0;1;0;;1;;1;T;;0;1;1;ZM : fermé le 31/12/2006;0 +1336;1;ZM DEPISTAGE/ADI;ACTE DE RADIOLOGIE MAMMOGRAPHIE DEPISTAGE;1;0;1;0;;1;;1;T;;0;1;1;ZM : fermé le 31/12/2006;0 +1341;99;P;ACTES D ANATOMO-CYTO-PATHOLOGIE/MEDECINS;1;9;99;0;;1;;1;T;20;0;1;1;fermé le 01/03/2011;0 +1342;99;MAP;MAJORATION ANATOMO-CYTO-PATHOLOGIE;1;9;99;1;19;0;;1;Z;20;0;1;1;fermé le 01/03/2011;0 +1345;1;MTC;MAJORATION TRANSITOIRE;1;0;1;1;14;0;;1;Z;20;0;1;;fermé le 31/12/2006;0 +1351;99;ADI;ACTE D'IMAGERIE (hors ECHOGRAPHIE) CCAM;1;9;99;0;;1;1;1;T;20;1;25;1;;0 +1352;99;ATM;ACTES TECHNIQUES MEDICAUX (hors IMAGERIE) CCAM;1;9;99;0;;1;1;1;T;20;1;25;1;;0 +1361;1;VDC;VIDEOCAPSULE;1;0;1;0;;1;;1;T;20;0;49;;;0 +1400;99;BR1;BRIDGE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1401;99;PF0;PROTHESE FIXE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1402;99;PF1;PROTHESE FIXE RAC MODERE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1403;99;RF0;REPARATION FACETTE PROTHESE AMOVIBLE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1411;1;SCM/SPA;ACTES EN SCM (ET SPA POUR LA CRPCEN);1;0;1;0;;1;1;1;T;;1;34;1;SCM : fermé le 01/12/2014;0 +1415;99;CM0;PROTHESE FIXE METALLIQUE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1416;99;CT0;COURONNE TRANSITOIRE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1417;99;CZ0;COURONNE ZIRCONE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1418;99;IC0;INLAY CORE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1419;99;PA0;PROTHESE AMOVIBLE RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1420;99;PT0;PROTHESE AMOVIBLE DE TRANSITION RAC 0;1;9;99;0;;1;;1;T;20;1;42;;;0 +1421;1;PRO;ACTES DE PROTHESE DENTAIRE PRATIQUES PAR LE MEDECIN;1;0;1;0;;1;1;1;T;20;1;35;;fermé le 01/12/2014;0 +1422;1;ORT/EOS;TRAITEMENTS D ORTHODONTIE PRATIQUES PAR LE MEDECIN (ET EOS POUR LA CRPCEN);1;0;1;0;;1;;1;T;20;1;36;;;0 +1423;1;SPR;ACTES DE PROTHESE DENTAIRE PRATIQUES PAR LE CHIRURGIEN-DENTISTE;1;0;2;0;;1;1;1;Z;20;0;35;;fermé le 01/12/2014;0 +1424;1;TO/ETO;TRAITEMENTS D ORTHODONTIE PRATIQUES PAR LE CHIRURGIEN-DENTISTE (ET ETO POUR LA CRPCEN);1;0;2;0;;1;;1;Z;;0;36;;;0 +1425;1;ATD;COMPLEMENT AT 150% 
DENTAIRE;1;0;99;1;27;0;;1;Z;20;0;35;;;0 +1426;99;DDE;PEC EXCEPTIONNELLE DÉPASSEMENT DENTAIRE;1;9;99;0;;0;;1;Z;20;0;41;;;0 +1427;99;DDT;PEC EXCEPTIONNELLE DEPASSEMENT DENTAIRE TP;1;9;99;0;;0;;1;Z;20;0;41;;;0 +1650;99;CAA;AIDE A L'ACTIVITE COTRAM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1651;99;MRA;PAIEMENT MAJORATION REMUNERATION ARS COTRAM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1652;99;RFC;PTMR - REMPLACEMENT;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1653;99;RCR;PTMR - MALADIE, MATERNITE, PATERNITE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1701;1;CMD;CDS MEDICAL OPTANT FORF DEBUT;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1702;1;CMF;CDS MEDICAL OPTANT FORF FIN;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1703;1;CMT;CDS MEDICAL OPTANT FORFTACITE;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1704;1;CDI;CDS DENTAIRE OPTANT FORF INITIAL;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1705;1;CDS;CDS DENTAIRE OPTANT FORF SUIVI;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1706;1;CDF;CDS DENTAIRE OPTANT FORF FINAL;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1707;1;CID;CDS INFIRMIER OPTANT FORF DEBUT;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1708;1;CIF;CDS INFIRMIER OPTANT FORF FIN;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1711;1;MRD;FORFAIT MEDECIN REFERENT DEBUT DE CONTRAT;1;0;1;0;;0;;1;Z;20;0;46;;fermé le 28/02/2007;0 +1712;1;MRF;FORFAIT MEDECIN REFERENT FIN DE CONTRAT;1;0;1;0;;0;;1;Z;20;0;46;;fermé le 28/02/2007;0 +1713;1;MRI;FORFAIT MEDECIN REFERENT INFORMATISE;1;0;1;0;;0;;1;Z;;0;46;;fermé le 20/10/1997;0 +1715;1;FMC;FORFAIT MENSUEL COORDONNATEUR;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1716;1;FMP;FORFAIT MENSUEL PARTICIPATION;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1717;1;FMS;FORFAIT MENS UEL SOINS;1;0;1;0;;0;;1;C;20;0;46;;;0 +1718;1;FAZ;FORFAIT D'ADHESION ZONE DEFICITAIRE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1721;1;F01;FORFAIT PROFESSIONNEL (F01) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1722;1;F02;FORFAIT PROFESSIONNEL (F02) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1723;1;F03;FORFAIT PROFESSIONNEL (F03) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1724;1;F04;FORFAIT PROFESSIONNEL (F04) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1725;1;F05;FORFAIT 
PROFESSIONNEL (F05) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1726;1;F06;FORFAIT PROFESSIONNEL (F06) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1727;1;F07;FORFAIT PROFESSIONNEL (F07) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1728;1;F08;FORFAIT PROFESSIONNEL (F08) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1729;1;F09;FORFAIT PROFESSIONNEL (F09) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1731;1;F10;FORFAIT PROFESSIONNEL (F10) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1732;1;F11;FORFAIT PROFESSIONNEL (F11) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1733;1;F12;FORFAIT PROFESSIONNEL (F12) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1734;1;F13;FORFAIT PROFESSIONNEL (F13) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1735;1;F14;FORFAIT PROFESSIONNEL (F14) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1736;1;F15;FORFAIT PROFESSIONNEL (F15) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1741;1;FC0;FORFAIT CONSULTATION (FC0) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1742;1;FC1;FORFAIT CONSULTATION (FC1) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1743;1;FC2;FORFAIT CONSULTATION (FC2) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1744;1;FC3;FORFAIT CONSULTATION (FC3) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1745;1;FC4;FORFAIT CONSULTATION (FC4) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1746;1;FC5;FORFAIT CONSULTATION (FC5) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1747;1;FC6;FORFAIT CONSULTATION (FC6) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1748;1;FC7;FORFAIT CONSULTATION (FC7) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1749;1;FC8;FORFAIT CONSULTATION (FC8) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1751;1;FC9;FORFAIT CONSULTATION (FC9) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1761;1;FF0;FORFAIT FORMATION (FF0) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1762;1;FF1;FORFAIT FORMATION (FF1) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1763;1;FF2;FORFAIT FORMATION (FF2) 
FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1764;1;FF3;FORFAIT FORMATION (FF3) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1765;1;FF4;FORFAIT FORMATION (FF4) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1766;1;FF5;FORFAIT FORMATION (FF5) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1767;1;FF6;FORFAIT FORMATION (FF6) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1768;1;FF7;FORFAIT FORMATION (FF7) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1769;1;FF8;FORFAIT FORMATION (FF8) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1771;1;FF9;FORFAIT FORMATION (FF9) FILIERES ET RESEAUX;1;2;;0;;0;;1;Z;20;0;47;;;0 +1781;1;FP0;FORFAIT PREVENTION/DEPISTAGE (FP0) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1782;1;FP1;FORFAIT PREVENTION/DEPISTAGE (FP1) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1783;1;FP2;FORFAIT PREVENTION/DEPISTAGE (FP2) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1784;1;FP3;FORFAIT PREVENTION/DEPISTAGE (FP3) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1785;1;FP4;FORFAIT PREVENTION/DEPISTAGE (FP4) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1786;1;FP5;FORFAIT PREVENTION/DEPISTAGE (FP5) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1787;1;FP6;FORFAIT PREVENTION/DEPISTAGE (FP6) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1788;1;FP7;FORFAIT PREVENTION/DEPISTAGE (FP7) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1789;1;FP8;FORFAIT PREVENTION/DEPISTAGE (FP8) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1791;1;FP9;FORFAIT PREVENTION/DEPISTAGE (FP9) FILIERES ET RESEAUX;1;2;;0;;0;;1;C;20;0;46;;;0 +1811;1;IK IKP;IK PLAINE;1;2;;1;10;0;1;1;Z;;0;55;;;0 +1812;1;IKM;IK MONTAGNE;1;2;;1;10;0;1;1;Z;20;0;55;;;0 +1813;1;IKS;IK PIED SKI;1;2;;1;10;0;1;1;Z;20;0;55;;;0 +1814;1;IKG;FRAIS DE DEPLACEMENT VACATION;1;2;;1;10;0;;1;Z;20;0;55;;;0 +1821;1;ID;ID PARIS LYON MARSEILLE, +100.000 HA, -100.000 HA;1;2;;1;10;0;1;1;Z;20;0;7;;;0 +1841;1;IF;INDEMNITES FORFAITAIRES DE DEPLACEMENT;1;2;;1;10;0;1;1;Z;20;0;7;;;0 +1842;2;IFA;INDEMNITES 
FORFAITAIRES DE DEPLACEMENT DES AUXILIAIRES MEDICAUX ET ASSIMILES;1;0;3;1;10;0;1;1;Z;20;0;7;;;0 +1843;2;IFO;INDEMNITES FORFAITAIRES DE DEPLACEMENT MK ORTHOPEDIQUE ET RHUMATOLOGIQUE;1;0;3;1;10;0;1;1;Z;20;0;7;;;0 +1844;2;IFR;INDEMNITES FORFAITAIRES DE DEPLACEMENT MK RHUMATISMALE;1;0;3;1;10;0;1;1;Z;20;0;7;;;0 +1845;2;IFN;INDEMNITES FORFAITAIRES DE DEPLACEMENT MK NEUROLOGIQUE;1;0;3;1;10;0;1;1;Z;20;0;7;;;0 +1846;2;IFP;INDEMNITES FORFAITAIRES DE DEPLACEMENT MK PNEUMOLOGIE;1;0;3;1;10;0;1;1;Z;20;0;7;;;0 +1847;2;IFS;INDEMNITES FORFAITAIRES DE DEPLACEMENT DE SORTIE;1;0;3;1;10;0;1;1;Z;20;0;7;;;0 +1903;99;MEG;MAJORATION ENFANT GENERALISTE;1;9;99;1;5;0;;1;C;20;0;1;;;0 +1904;99;MEP;MAJORATION ENFANT PEDIATRE;1;9;99;1;5;0;;1;C;20;0;1;1;;0 +1905;99;NFE;NOUVEAU FORFAIT ENFANT;1;9;99;1;5;0;;1;C;20;0;1;1;;0 +2226;5;SPB;SUPPLEMENT POUR CHAMBRE PLOMBEE;0;0;10;1;13;0;;0;Z;;0;0;;fermé le 30/04/2003;0 +2227;5;SAP;SUPPLEMENT POUR ALIMENTATION PARENTERALE;0;0;10;1;13;0;;0;Z;;0;0;;fermé le 30/04/2003;0 +2229;5;FNN;FORFAIT PRISE EN CHARGE DU NOUVEAU NE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2230;99;PJL;PRIX DE JOURNEE REGIME LOCAL;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2231;5;PHJ;FORFAIT PHARMACEUTIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2232;5;FTO;FORFAIT TRANSPLANTATION D ORGANE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2234;5;ENT;FORFAIT D ENTREE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2235;5;ANP;FORFAIT D ACTIVITE NON PROGRAMMEE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2236;5;FCO;FORFAIT CONSOMMABLE CARDIOLOGIE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2237;5;PJC;PART COMPLEMENTAIRE AIDE MEDICALE ETAT (REGULARISATION CMU COMPLEMENTAIRE);0;0;10;0;;0;;0;Z;22;0;0;;;0 +2238;5;ATU;FORFAIT D ACCUEIL ET DE TRAITEMENT DES URGENCES;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2239;5;FAU;FORFAIT ANNUEL D URGENCE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2240;5;TJC;TARIF JOURNALIER COMPLEMENTAIRE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2241;5;FSO;FRAIS DE SALLE D OPERATION;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2242;5;ARE;FRAIS D ANESTHESIE ET REANIMATION;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2243;5;FE;FRAIS D 
ENVIRONNEMENT;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2244;99;I02;Forfait Innovation ARGUS II;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2245;5;FST;FRAIS DE SALLE D ACCOUCHEMENT;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2246;5;FSG;FRAIS DE SALLE D ACCOUCHEMENT MULTIPLE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2247;5;FSY;FORFAIT PSYCHIATRIE DE SECURITE - HOSPITALISATION AVEC HEBERGEMENT;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2248;99;I01;Forfait Innovation HIFU;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2249;99;FAI;FORFAIT ACTIVITE ISOLE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2250;5;FJC;FORFAIT JOURNALIER AIDE MEDICALE (REGULARISATION CMU COMPLEMENTAIRE);0;0;10;0;;0;;0;Z;22;0;0;;;0 +2251;5;FJ;FORFAIT JOURNALIER;0;1;10;1;12;0;;0;Z;22;0;0;;;0 +2252;5;FJA;FORFAIT JOURNALIER DE SORTIE;0;1;10;0;;0;;0;Z;22;0;0;;;0 +2257;5;FJO;FJ TRANSPLANTATION ORGANES;0;1;10;0;;0;;0;Z;22;0;0;;;0 +2258;5;FSJ;FORFAIT DE SOINS JOURNALIER;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2259;5;;SAISIE MANUELLE DES SEJOURS POUR LEQUELS LE FJ EST SUPERIEUR AU TM;0;0;10;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +2260;99;HPC;FORFAIT HOPITAL PROXIMITE COMPLEMENTAIRE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2261;5;FA1;FORFAIT ACCUEIL DU PATIENT N 1;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2262;5;FA2;FORFAIT ACCUEIL DU PATIENT N 2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2263;99;IFQ;FORFAIT IFAQ;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2264;99;IFZ;FORFAIT IFAQ SSR;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2265;99;DMA;DOTATION MODULEE A L'ACTIVITE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2271;5;AS1;FORFAIT HPT GROUPE 1;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2272;5;AS2;FORFAIT HPT GROUPE 1.2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2273;5;AS3;FORFAIT HPT 15 %;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2274;5;AS4;FORFAIT HPT GROUPE 2 + FAS1;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2275;5;AS5;FORFAIT HPT GROUPE 2 + FAS2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2282;5;PAT;PARTICIPATION ASSURE TRANSITOIRE;1;0;10;1;16;0;;0;Z;;0;0;;fermé le 30/09/2007;0 +2283;5;PAH;PARTICIPATION ASSURE HOSPITALISATION PUBLIQUE (CMU + AME);0;0;10;1;16;0;;0;Z;22;0;0;;;0 +2284;5;PAJ;PARTICIPATION ASSURE HOSPITALISATION PUBLIQUE (REGIME 
LOCAL);0;0;10;1;16;0;;0;Z;22;0;0;;;0 +2285;5;PAS;PARTICIPATION ASSURE SUR SEJOUR;0;0;10;1;16;0;;0;Z;22;0;0;;;0 +2311;5;FNO;FORFAIT DE SEANCE DE SOINS SANS OXYGENE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 31/12/1998;0 +2312;5;FOC;FORFAIT POUR INSUFFISANCE RESPIRATOIRE AVEC OXYGENE EXTRACTEUR;0;0;10;0;;0;;0;Z;;0;0;;fermé le 31/12/1998;0 +2313;5;FOB;FORFAIT POUR INSUFFISANCE RESPIRATOIRE AVEC OXYGENE BOUTEILLE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 31/12/1998;0 +2314;5;FOL;FORFAIT POUR INSUFFISANCE RESPIRATOIRE AVEC OXYGENE LIQUIDE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 31/12/1998;0 +2315;5;PPC;APPAREILLAGE VENTILATION PRESSION POSITIVE CONTINUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2321;5;FPA/FS;FORFAIT - LONG SEJOUR PERSONNES AGEES;0;0;10;0;;0;;0;Z;;0;0;;;0 +2331;5;SNS;RADIUMTHERAPIE ET CHIMIOTHERAPIE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2332;5;RF/SNS;READAPTATION FONCTIONNELLE;0;0;10;0;;0;;0;Z;;0;0;;;0 +2333;5;RP/FS;REEDUCATION PROFESSIONNELLE;0;0;10;0;;0;;0;Z;;0;0;;;0 +1974;1;FRT;FRANCHISE TIERS PAYANT SUR TRANSPORT;1;1;;1;16;0;;1;Z;20;0;52;;;0 +1975;1;FRH;FRANCHISE HORS TIERS PAYANT ACTE D'AUXILIAIRE MEDICAUX;1;1;;1;16;0;;1;Z;20;0;26;;;0 +1976;1;FRT;FRANCHISE TIERS PAYANT ACTE D'AUXILIAIRE MEDICAUX;1;1;;1;16;0;;1;Z;20;0;26;;;0 +1977;1;FRH;PARTICIPATION ASSURE HORS TIERS PAYANT TRANSMISE SANS ACTE DE REFERENCE OU TYPE DE FRANCHISES;1;2;;1;16;0;;1;Z;20;0;26;;;0 +1978;1;FRT;PARTICIPATION ASSURE EN TIERS PAYANT TRANSMISE SANS ACTE DE REFERENCE OU TYPE DE FRANCHISES;1;2;;1;16;0;;1;Z;20;0;26;;;0 +1981;1;FHV;FORFAIT IVG HONORAIRES DE VILLE;1;0;1;0;;1;;1;C;20;0;1;1;;0 +1990;99;FIM;FORFAIT D'INTERVENTION PAR SORTIE SUR DEMANDE DE LA REGULATION;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1991;1;REG;REMUNERATION REGULATION;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1992;1;PRN;PERMANENCE REMUNERATION DE NUIT;1;0;1;0;;0;;1;Z;20;0;31;;;0 +1993;1;PRM;PERMANENCE REMUNERATION MILIEU DE NUIT;1;0;1;0;;0;;1;Z;20;0;31;;;0 +1994;1;PRD;PERMANENCE REMUNERATION DIMANCHE ET FERIE;1;0;1;0;;0;;1;Z;20;0;31;;;0 +1995;1;PRT;PERMANENCE REMUNERATION 
TOTAL;1;0;1;0;;0;;1;Z;20;0;31;;;0 +1996;1;RSA;PERMANENCE REMUNERATION SAMEDI MATIN;1;0;1;0;;0;;1;Z;20;0;31;;;0 +1997;1;RSP;PERMANENCE REMUNERATION SAMEDI APRES MIDI;1;0;1;0;;0;;1;Z;20;0;31;;;0 +1998;99;AJS;ASTREINTE DE JOUR CORRESPONDANT SAMU;1;9;99;0;;0;;1;Z;20;0;31;;;0 +1999;99;ANS;ASTREINTE DE NUIT CORRESPONDANT SAMU;1;9;99;0;;0;;1;Z;20;0;31;;;0 +2106;99;TDD;TRANSPORT DEFINITIF DIALYSE;0;9;10;1;13;0;;0;Z;22;0;10;;;0 +2107;99;TSD;TRANSPORT SEANCE DIALYSE;0;9;10;1;13;0;;0;Z;22;0;10;;;0 +2108;99;TDE;SUPPLEMENT TRANSPORT 2;0;9;10;0;;0;;0;Z;22;0;10;;;0 +2109;99;TSE;SUPPLEMENT TRANSPORT SEANCES;0;9;10;0;;0;;0;Z;22;0;10;;;0 +2111;5;GHS;FRAIS D HEBERGEMENT ET ENVIRONNEMENT EN GHS;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2112;5;EXH;FRAIS DE SEJOUR SUPPLEMENTAIRE AU GHS;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2113;5;GHT;GROUPE HOMOGENE DE TARIFS;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2114;99;SRA;SUPPLEMENT SOINS PARTICULIEREMENT COUTEUX (ARRETE DU 29/06/1978);0;9;99;0;;0;;0;Z;22;0;0;;fermé le28/02/2009;0 +2115;99;SSC;SUPPLEMENT SOINS CONTINUS;0;9;99;0;;0;;0;Z;22;0;0;;fermé le 28/02/2009;0 +2116;5;NN1;SUPPLEMENT NEONATOLOGIE 1;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2117;5;NN2;SUPPLEMENT NEONATOLOGIE 2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2118;5;NN3;SUPPLEMENT NEONATOLOGIE 3;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2119;99;SDC;SUPPLEMENT DEFIBRILLATEUR;0;9;10;1;13;0;;0;Z;22;0;0;;;0 +2120;5;DTC;DIFFERENTIEL TARIFAIRE CLINIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2121;5;D01;HEMODIALYSE EN CENTRE OU EN UNITE DE DIALYSE MEDICALISEE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2122;5;D02;AUTODIALYSE SIMPLE OU ASSISTEE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2142;5;D19;FORFAIT D ENTRAINEMENT A DIALYSE PERITONEALE CONTINUE AMBULATOIRE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 01/03/2013;0 +2143;5;D20;FF D ENTRAINEMENT A LA DIALYSE PERITONEALE AUTOMATISEE A DOMICILE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2144;5;D21;FF D ENTRAINEMENT A LA DIALYSE PERITONEALE CONTINUE AMBULATOIRE A DOMICILE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2145;5;D22;FORFAIT DE DIALYSE PERITONEALE 
AUTOMATISE POUR HOSPITALISATION DE 3 A 6 JOURS;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2146;5;D23;FORFAIT DE DIALYSE PERITONEALE CONTINUE AMBULATOIRE POUR HOSPITALISATION DE 3 A 6 JOURS;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2147;99;D24;FORFAIT D'ENTRAINEMENT A L'HEMODIALYSE EN UNITE DE DIALYSE MEDICALISEE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2150;5;VDE;VIDEOCAPSULE;0;0;10;0;;0;;1;T;21;0;10;;;0 +2151;5;REA;SUPPLEMENT REANIMATION;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2152;5;SRC;SUPPLEMENT SURVEILLANCE CONTINUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2153;5;STF;FORFAIT SOINS INTENSIFS;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2154;5;REP;REANIMATION PEDIATRIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2155;5;SE1;FORFAIT ENVIRONNEMENT HOSPITALIER 1;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2156;5;SE2;FORFAIT ENVIRONNEMENT HOSPITALIER 2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2157;5;SE3;FORFAIT ENVIRONNEMENT HOSPITALIER 3;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2158;5;SE4;FORFAIT ENVIRONNEMENT HOSPITALIER 4;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2159;5;FSD;FORFAIT DE SECURITE DERMATOLOGIQUE;1;0;10;0;;0;;1;T;20;0;49;;;0 +2160;99;MGS;MISSION D'INTERET GENERAL SSR;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2161;5;MGC;FORFAIT MISSION D INTERET GENERAL D AIDE A LA CONTRACTUALISATION;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2162;5;FHT;FORFAIT HAUTE TECHNICITE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2163;5;DIP;SUPPLEMENT JOURNALIER DIALYSE PERITONEALE;0;0;10;1;13;0;;0;Z;22;0;0;;;0 +2164;5;APE;ADMINISTRATION DE PRODUITS ET PRESTATIONS EN ENVIRONNEMENT HOSPITALIER;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2165;99;FPI;FORFAIT PRESTATION INTERMEDIAIRE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2166;99;AP2;ADMINISTRATION DE MEDICAMENTS EN ENVIRONNEMENT HOSPITALIER;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2167;99;SE5;FORFAIT ENVIRONNEMENT HOSPITALIER 5;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2168;99;SE6;FORFAIT ENVIRONNEMENT HOSPITALIER 6;0;9;10;0;;0;;0;Z;22;0;10;;;0 +2170;99;FIP;FIR ETS PRIVES;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2171;99;DTA;DIFFERENTIEL TARIFAIRE AME;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2172;99;DTM;DIFFERENTIEL TARIFAIRE MIGRANTS;0;9;10;0;;0;;0;Z;22;0;0;;;0 
+2173;99;DPC;DIFFERENTIEL PSY REGLEMENTAIRE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2174;99;DPA;DIFFERENTIEL PSY AME;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2175;99;DPM;DIFFERENTIEL PSY MIGRANTS;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2176;99;DSC;DIFFERENTIEL SSR REGLEMENTAIRE;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2177;99;DSA;DIFFERENTIEL SSR AME;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2178;99;DSM;DIFFERENTIEL SSR MIGRANTS;0;9;10;0;;0;;0;Z;22;0;0;;;0 +2181;5;PO1;PRELEVEMENT D ORGANE 1;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2182;5;PO2;PRELEVEMENT D ORGANE 2;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2183;5;PO3;PRELEVEMENT D ORGANE 3;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2184;5;CPO;COORDINATION PRELEVEMENT D ORGANES;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2185;5;PO4;PRELEVEMENT D ORGANE 4;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2186;5;PO5;PRELEVEMENT D'ORGANE 5;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2187;5;PO6;PRELEVEMENT D'ORGANE 6;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2188;5;PO7;PRELEVEMENT D'ORGANE 7;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2189;5;PO8;PRELEVEMENT D'ORGANE 8;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2190;5;PO9;PRELEVEMENT D'ORGANE 9;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2191;5;POA;PRELEVEMENT D'ORGANE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2195;5;ANT;SUPPLEMENT ANTEPARTUM;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2196;5;RAP;SUPPLEMENT RADIOTHERAPIE PEDIATRIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2201;6;;BUDGET GLOBAL;0;0;10;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +2202;6;;MEDICALISATION DES PERSONNES AGEES;0;0;10;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +2203;6;;SSAD : SERVICE DE SOINS A DOMICILE;0;0;10;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +2204;6;;CAMSP: CENTRE ACTION MEDICO-SOCIALE PRECOCE;0;0;10;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +2334;5;FSE/SNS;SEANCE D HEMODIALYSE;0;0;10;0;;0;;0;Z;;0;0;;FSE fermé le 28/02/2005;0 +2335;5;FP;FORFAIT DE PANSEMENT;0;0;10;0;;0;;0;Z;21;0;0;;;0 +2336;5;SNS;FORFAIT POUR CONSULTATION EN CENTRE MEDICO-PSYCHO PEDAGOGIQUE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2337;5;SD;SEANCE DE DIAGNOSTIC;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2338;5;FFM;FORFAIT PETIT 
MATERIEL;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2339;5;FS/SNS;AUTRES FORFAITS DIVERS (Y COMPRIS NUTRITION ENTERALE A DOMICILE);0;0;10;0;;0;;0;Z;;0;0;;;0 +2341;5;SFC;SUPPLEMENT AU FORFAIT CHIMIOTHERAPIE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2342;5;RGD;FORFAIT POUR GARDE DE DEBUT DE NUIT EN ETABLIS. PRIVE;0;0;10;0;;0;;1;Z;20;0;47;;;0 +2343;5;RGN;FORFAIT POUR GARDE DE NUIT OU SAMEDI APRES MIDI EN ETABLIS. PRIVE;0;0;10;0;;0;;1;Z;20;0;47;;;0 +2344;5;FPG;FORFAIT DE GARDE NUIT ET FERIE EN ETABLIS. PRIVE;0;0;10;0;;0;;1;Z;20;0;47;;;0 +2345;5;RAN;FORFAIT POUR ASTREINTE DE DEBUT DE NUIT EN ETABLIS. PRIVE;0;0;10;0;;0;;1;Z;20;0;47;;;0 +2346;5;RAG;FORFAIT POUR ASTREINTE DE NUIT OU SAMEDI APRES MIDI EN ETABLIS. PRIVE;0;0;10;0;;0;;1;Z;20;0;47;;;0 +2347;5;FPA;FORFAIT D'ASTREINTE NUIT ET FERIE EN ETABLIS. PRIVE;0;0;10;0;;0;;1;Z;20;0;47;;;0 +2351;5;FTN;FORFAIT TECHNIQUE NORMAL IRMN -SCANNERS;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2352;5;FTR;FORFAIT TECHNIQUE REDUIT IRMN -SCANNERS;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2353;5;FTS;FORFAIT TECHNIQUE SCANNER (SPP expo amiante);1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2354;5;FTG;FORFAIT TECHNIQUE TOMOGRAPHIE;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2355;5;N01;FORFAIT CONCOMMABLE MEDECINE NUCLEAIRE 01;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2356;5;N02;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 02;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2357;5;N03;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 03;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2358;5;N04;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 04;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2359;5;N05;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 05;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2360;5;N06;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 06;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2361;5;N07;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 07;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2362;5;N08;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 08;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2363;5;N09;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 09;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2364;5;N10;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 10;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2365;5;N11;FORFAIT CONSOMMABLE 
MEDECINE NUCLEAIRE 11;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2366;5;N12;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 12;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2367;5;N13;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 13;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +2368;5;N14;FORFAIT CONSOMMABLE MEDECINE NUCLEAIRE 14;1;1;10;0;;0;1;0;Z;22;0;0;;;0 +1906;99;NFP;NOUVEAU FORFAIT PEDIATRIQUE;1;9;99;1;5;0;;1;C;20;0;1;1;;0 +1907;99;PRX;REMUNERATION DES SOINS DE PROXIMITE;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1908;99;SP2;FORFAIT SORTIE PRECOCE ET TEST GUTHRIE;1;9;99;0;;1;;1;Z;20;0;1;1;;0 +1909;99;SP1;FORFAIT SORTIE PRECOCE;1;9;99;0;;1;;1;Z;20;0;1;1;;0 +1910;99;PPS;Plan personnalisé de santé;1;9;99;0;;0;;1;Z;20;0;46;;;0 +1911;1;SF;ACTES DES SAGES-FEMMES;1;1;;0;;1;1;1;Z;20;0;33;1;;0 +1912;1;NA;HONORAIRES NON VENTILABLES INDIVIDUALISES;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1913;1;MM;MAJORATION MILIEU DE NUIT;1;1;1;1;1;0;1;1;Z;20;0;70;1;;0 +1914;1;FPE;FORFAIT PEDIATRIQUE;1;1;;1;5;0;;1;Z;20;0;1;1;fermé le 31/12/2017;0 +1915;1;ASR;FORFAIT ASTREINTE PROF;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1916;1;RPG;REMUNERATION POUR GARDE ETS PRIVES;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1917;1;RPA;REMUNERATION POUR ASTREINTE ETS PRIVES;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1918;1;MU;MAJORATION D'URGENCE;1;0;1;1;3;0;1;1;C;20;0;1;1;;0 +1919;1;;DEPENSES DE MEDECINE FORFAITAIRE (OMNIPRATICIENS ET AUXILIAIRES MEDICAUX);1;0;1;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +1920;99;CIS;CONTRAT DEMOGRAPHIQUE SAGE-FEMME;1;9;99;0;;0;;1;Z;;0;47;;;0 +1921;1;TDR;TEST DE DIAGNOSTIC RAPIDE;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1922;99;CG;CONSULTATION;1;9;99;0;;1;;1;Z;20;0;1;1;fermé le 16/04/2013;0 +1923;1;SP;EXAMEN DE SUIVI POST NATAL;1;1;;0;;1;;1;Z;20;0;1;1;;0 +1924;1;VGM;REMUNERATION VACCINATION GRIPPE A MEDECIN;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1925;1;VMR;REMUNERATION VACCINATION GRIPPE A PAR MEDECIN RETRAITE ET SALARIES HORS OBLIGATIONS;1;0;1;0;;0;;0;Z;20;0;0;;;0 +1926;99;VAC;ACTE DE VACCINATION GRIPPE A/H1N1;1;9;99;0;;0;;1;T;20;0;1;;fermé le 30/09/2010;0 +1931;1;MNP;MAJORATION NOURRISSON 
PEDIATRE;1;1;;1;17;0;;1;Z;20;0;1;1;fermé le 31/12/2017;0 +1932;1;MNO;MAJORATION NOURRISSON GENERALISTE;1;1;;1;17;0;;1;Z;20;0;1;1;fermé le 31/12/2017;0 +1933;1;CRN;MAJORATION CONSULTATION REGULEE DE NUIT;1;0;1;1;1;0;;1;Z;20;0;1;1;;0 +1934;1;CRM;MAJORATION CONSULTATION REGULEE MILIEU DE NUIT;1;0;1;1;1;0;;1;Z;20;0;1;1;;0 +1935;1;CRD;MAJORATION CONSULTATION REGULEE DIMANCHE, JOURS FERIES ET ASSIMILES;1;0;1;1;2;0;;1;Z;20;0;1;1;;0 +1936;1;VRN;MAJORATION VISITE REGULEE DE NUIT;1;0;1;1;1;0;;1;Z;20;0;1;1;;0 +1937;1;VRM;MAJORATION VISITE REGULEE MILIEU DE NUIT;1;0;1;1;1;0;;1;Z;20;0;1;1;;0 +1938;1;VRD;MAJORATION VISITE REGULEE DE DIMANCHE, JOURS FERIES ET ASSIMILES;1;0;1;1;2;0;;1;Z;20;0;1;1;;0 +1939;99;MSF;MAJORATION SAGE-FEMME;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1940;99;DSP;FORFAIT SORTIE PRECOCE;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1941;1;CRS;MAJORATION CONSULTATION REGULEE SAMEDI APRES MIDI;1;0;1;1;2;0;;1;Z;20;0;1;1;;0 +1942;1;VRS;MAJORATION VISITE REGULEE SAMEDI APRES MIDI;1;0;1;1;2;0;;1;Z;20;0;1;1;;0 +1943;99;MUT;MAJORATION URGENCE MT;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1944;99;MCU;MAJORATION CORRESPONDANT URGENCE;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1945;99;MRT;MAJORATION MEDECIN TRAITANT REGULATION;1;9;99;1;22;0;;1;C;20;0;1;1;;0 +1951;1;PFH;PARTICIPATION FORFAITAIRE HORS TIERS PAYANT;1;2;;1;16;0;;1;Z;20;0;54;;;0 +1952;1;PFT;PARTICIPATION FORFAITAIRE TIERS PAYANT;1;2;;1;16;0;;1;Z;20;0;54;;;0 +1954;5;PAE;PARTICIPATION ASSURE CONSULTATIONS ET SOINS EXTERNES (CMU + AME);1;0;;1;16;0;;0;Z;20;0;0;;;0 +1955;1;PAL;PARTICIPATION ASSURE CONSULTATIONS ET SOINS EXTERNES (REGIME LOCAL);1;0;;1;16;0;;0;Z;20;0;0;;;0 +1956;1;PAP;PARTICIPATION ASSURE EN AMBULATOIRE;1;0;1;1;16;0;;1;Z;20;0;56;;;0 +1957;99;TMT;MAJORATION HORS PARCOURS DE SOINS;1;9;99;1;24;0;;1;Z;20;0;54;;;0 +1960;99;SGA;SUPPLEMENT DEROGATOIRE SG SUR ACTE PROFESSIONNEL NON REMBOURSABLE (CNMSS);0;9;99;0;;1;;1;Z;20;0;1;;;0 +1961;99;DAP;SUPPLEMENT DEROGATOIRE SG SUR ACTE PROFESSIONNEL REMBOURSABLE (CNMSS);0;9;99;1;29;0;;1;Z;20;0;1;;;0 
+1971;1;FRH;FRANCHISE HORS TIERS PAYANT SUR MEDICAMENT;1;1;;1;16;0;;1;Z;20;0;53;;;0 +1972;1;FRT;FRANCHISE TIERS PAYANT SUR MEDICAMENT;1;1;;1;16;0;;1;Z;20;0;53;;;0 +1973;1;FRH;FRANCHISE HORS TIERS PAYANT SUR TRANSPORT;1;1;;1;16;0;;1;Z;20;0;52;;;0 +9743;12;HAT;AIDES A LA DEAMBULATION ET AU TRANSPORT;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9744;12;HAU;AUTRES TYPES D AIDES;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9751;12;SPL;FOURNITURES ET ACCESSOIRES NON REMBOURSABLES;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9752;12;HAD;GARDES MALADES A DOMICILE;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9761;12;SOR;AIDES FINANCIERES INDIVIDUELLES ORTHODONTIE;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9762;12;SEP;AIDES FINANCIERES INDIVIDUELLES PHARMACIE/LPP;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9763;12;DIL;AIDES FINANCIERES A CARACTERE SOCIAL AFFECTEES AU LOGEMENT;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9764;12;DIB;AIDES FINANCIERES A CARACTERE SOCIAL AFFECTEES AUX BESOINS ALIMENTAIRES ET VESTIMENTAIRES;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9765;12;DIS;AIDES FINANCIERES A CARACTERE SOCIAL AFFECTEES AUX FRAIS DE SCOLARISATION;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9766;12;DIT;AIDES FINANCIERES A CARACTERE SOCIAL AFFECTEES AUX REGLEMENTS D'IMPOTS TAXES ET PRIMES;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9767;12;DIA;AIDES FINANCIERES A CARACTERE SOCIAL D'ATTENTE DE VERSEMENT DE REVENUS DE SUBSTITUTION;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9768;12;DIO;AIDES FINANCIERES A CARACTERE SOCIAL AFFECTEES AUX FRAIS D'OBSEQUES;0;0;19;0;;0;;0;Z;21;0;0;;;0 +9413;13;;HYGIENE BUCCO-DENTAIRE N92 (SCELLEMENT DE DEUX MOLAIRES);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0 +9414;13;;HYGIENE BUCCO-DENTAIRE N93 (SCELLEMENT DE TROIS MOLAIRES);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0 +9415;13;;HYGIENE BUCCO-DENTAIRE N94 (SCELLEMENT DE QUATRE MOLAIRES);0;0;20;0;;0;;1;Z;;0;20;;Saisie manuelle Qualiflux;0 +9421;13;BDC;PREVENTION BUCCO-DENTAIRE: CONSULTATION;0;0;20;0;;1;;1;Z;20;0;20;;;0 +9422;13;BR2;PREVENTION BUCCO-DENTAIRE: RADIO DEUX CLICHES;0;0;20;0;;1;;1;Z;20;0;20;;;0 +9423;13;BR4;PREVENTION BUCCO-DENTAIRE: 
RADIO QUATRE CLICHES;0;0;20;0;;1;;1;Z;20;0;20;;;0 +9424;13;RIN;PREVENTION BUCCO-DENTAIRE: RADIO EN IMAGERIE NUMERISEE;0;0;20;0;;0;;1;Z;;0;20;;fermé le 28/02/2005;0 +9425;12;;TM POUR DC (MSA);0;0;19;0;;0;;1;Z;;0;22;;Specifique NTEIR;0 +9426;12;;TM POUR SC(MSA);0;0;19;0;;0;;1;Z;;0;22;;Specifique NTEIR;0 +9427;12;;TM POUR Z(MSA);0;0;19;0;;0;;1;Z;;0;22;;Specifique NTEIR;0 +9429;13;BD2;CAMPAGNE BUCCO DENTAIRE MOCALES;0;0;20;0;;0;;1;Z;20;0;20;;;0 +9430;99;;TM DE L'ACTE DE VACCINATION GRIPPE A/H1N1;0;9;99;0;;0;;1;Z;;0;22;;fermé le 30/09/2010;0 +9431;13;PES;PREVENTION ENTRETIEN DE SANTE;0;0;20;0;;0;;1;Z;20;0;20;;;0 +9432;13;;TM DES INDEMNITES DE DEPLACEMENTS ID et MD;0;0;20;0;;0;;1;Z;;0;22;;;0 +9433;99;;TM DE LA RETINOPATHIE DIABETIQUE;0;9;99;0;;0;;1;Z;;0;22;;;0 +9434;99;DCC;ACTE DE DEPISTAGE DU CANCER COLORECTAL;0;9;99;0;;0;;1;Z;20;0;4;;;0 +9511;13;BDS/EDS;EXAMEN ET BILAN DE SANTE;0;0;20;0;;0;;0;Z;;0;0;;;0 +9512;13;BD5;AUTRES ACTIONS COLLECTIVES DE PREVENTION;0;0;20;0;;0;;0;Z;21;0;0;;;0 +9521;13;PDI;ACTES DE PREVENTION;1;0;20;0;;0;;0;Z;21;0;0;;;0 +9566;13;TNS;TRAITEMENT NICOTINIQUE DE SUBSTITUTION;1;0;20;0;;1;;1;Z;21;0;30;;fermé le 31/12/2018;0 +9567;13;RSO;REMUNERATION ADHESION SOPHIA (PREVENTION);0;0;20;0;;0;;1;Z;20;0;47;;La prestation 9567 est remplace par 1142;0 +9568;13;RSR;REMUNERATION RENOUVELLEMENT SOPHIA (PREVENTION);0;0;20;0;;0;;1;Z;20;0;47;;La prestation 9568 est remplace par 1143;0 +9569;99;RAD;RETOUR DOM. 
INSUFFISANT CARDIAQUE;0;9;99;0;;1;;1;Z;20;0;3;;;0 +9570;99;BPC;BRONCHO-PNEUMOPATHIE CHRONIQUE OBSTRUCTIVE;0;9;99;0;;0;;1;Z;20;1;3;;;0 +9601;12;MCP;MUTUELLE CHAMBRE PARTICULIERE (CAVIMAC);0;0;19;0;;0;;0;Z;;0;0;;fermé le 13/04/2003;0 +9602;12;MFM;FORFAIT DE SOINS INFIRMIERS MUTUELLE SECTEUR MEDICAL (CAVIMAC);0;0;19;0;;0;;0;Z;;0;0;;fermé le 13/04/2003;0 +9603;12;MFR;FORFAIT DE SOINS INFIRMIERS MUTUELLE SECTEUR REPOS CONVALESCENCE (CAVIMAC);0;0;19;0;;0;;0;Z;;0;0;;fermé le 13/04/2003;0 +9604;12;MLS;MUTUELLE LONG SEJOUR;0;0;19;0;;0;;0;Z;;0;0;;fermé le 13/04/2003;0 +9701;12;SSU;COMPLEMENT D ACTION SOCIALE APPAREIL DE SURDITE (CLERCS ET EMPLOYES DE NOTAIRES, PORT AUTONOME DE BORDEAUX);0;0;19;0;;0;;1;Z;21;0;19;;;0 +9702;12;;COMPLEMENT D ACTION SOCIALE BRIDGE (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;0;Z;;0;0;;Autre regime NEC;0 +9703;12;SDO;COMPLEMENT D ACTION SOCIALE DENTAIRE (CLERCS ET EMPLOYES DE NOTAIRES);0;0;19;0;;0;;1;Z;20;0;49;;;0 +9805;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9806;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9807;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9808;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9809;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9810;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9811;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9812;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9813;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9814;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9815;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9816;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 
+9817;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9818;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9819;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9820;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9821;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9822;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9823;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9824;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9825;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9826;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9827;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9828;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9829;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9830;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9831;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9832;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9833;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9834;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9835;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9836;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9837;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9838;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique NTEIR;0 +9839;0;;CODES MIS A LA DISPOSITION DE LA SNCF;0;0;;0;;0;;0;Z;;0;0;;Specifique 
NTEIR;0 +3351;2;PHU;MEDICAMENT AVEC UNE AUTORISATION TEMPORAIRE D'UTILISATION;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3352;2;PHM;PREPARATION MAGISTRALE HOSPITALIRE;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3353;2;PHP;PREPARATION HOSPITALIERE;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3354;2;PHI;MEDICAMENT AVEC AUTORISATION D'IMPORTATION;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3355;2;MAR;MARGE FORFAITAIRE (MEDICAMENTS HOSPITALIERS);1;0;5;0;;0;;1;Z;21;0;12;;;0 +3356;2;PHD;PHARMACIE HOSPITALIERE DEROGATOIRE;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3357;2;PHT;PHARMACIE HOSPITALIERE MMH;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3361;2;GPN;GARDE PHARMACIE NUIT;1;0;5;0;;0;;0;Z;21;0;0;;;0 +3362;2;GPF;GARDE PHARMACIE FERIE;1;0;5;0;;0;;0;Z;21;0;0;;;0 +3363;2;GPD;GARDE PHARMACIE DIMANCHE;1;0;5;0;;0;;0;Z;21;0;0;;;0 +3364;99;HDR;HONORAIRE MEDICAMENT REMBOURSABLE;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3365;99;HDA;HONORAIRE DISPENSATION EN LIEN AVEC AGE;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3366;99;HDE;HONORAIRE DE DISPENSATION MEDICAMENTS SPECIFIQUES;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3374;99;KGP;KIT ANTI-GRIPPE;0;9;99;0;;1;;1;Z;21;0;12;;fermé le 31/05/2010;0 +3375;3;;REMUNERATION PHARMACIENS POUR VACCINS H1N1;0;0;12;0;;1;;1;Z;;0;12;;Pas d information sur la prestation;0 +3378;99;CTR;CONTESTATION ROSP TRANSMISSION RPPS;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3379;99;HDS;Honoraire de dispensation spécifique vaccins anti grippaux Hémisphère Sud;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3380;99;HC;HONO DISP COMP;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3381;2;PPI;PREPARATION PHARMACEUTIQUE INDIVIDUALISEE (ALLERGENES);1;0;5;0;;1;;1;Z;21;0;5;;;0 +3382;99;PDP;PRISE EN CHARGE DEROGATOIRE PHARMACIE;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3383;99;HD1;HONO DISP 1;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3384;99;HD2;HONO DISP 2;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3385;99;HD4;HONO DISP 4;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3386;99;HD7;HONO DISP 7;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3387;99;HG1;HONO DISP GC 1;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3388;99;HG2;HONO DISP GC 2;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3389;99;HG4;HONO DISP GC 4;1;9;99;0;;0;;1;Z;21;0;12;;;0 
+3390;99;HG7;HONO DISP GC 7;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3391;99;ROP;REMU OBJECTIF - PHARMACIEN;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3392;99;COP;REMU OBJECTIF - PHARMACIEN COMPLEMENT;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3393;99;AVK;ROSP AVK;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3394;99;CVK;ROSP AVK COMPLEMENT;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3395;99;PPH;Plan personnalisé de santé pharmacie;1;9;99;0;;0;;1;Z;21;0;12;;;0 +3396;99;AHM;ROSP ASTHME;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3397;99;CHM;ROSP ASTHME COMPLEMENT;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3398;99;RTR;ROSP TRANSMISSION RPPS;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3399;3;;REMUNERATION PHARMACIENS POUR VACCINS H1N1 - REGULARISATION COMPTABLE ET SAISIE MANUELLE;0;0;12;0;;0;;1;Z;;0;12;;Saisie manuelle Qualiflux;0 +3411;2;SNG;SANG,PLASMA ET LEURS DERIVES;1;0;5;0;;1;;1;Z;21;0;12;;;0 +3412;2;TSG;TRANSPORT DU PRODUIT;1;0;5;0;;0;;1;Z;21;0;12;;;0 +3413;2;LAI;LAIT HUMAIN;1;0;5;0;;0;;1;Z;21;0;12;;;0 +3414;2;HUM;AUTRES PRODUITS D ORIGINE HUMAINE;1;0;5;0;;0;;1;Z;21;0;12;;;0 +3511;2;AAR;APPAREILS D ASSISTANCE RESPIRATOIRE,OXYGENOTHERAPIE A DOMICILE;1;0;6;0;;1;;1;Z;21;0;6;;;0 +3512;2;AAD;AUTRES MATERIELS POUR TRAITEMENTS A DOMICILE (CHAP. 1);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3513;2;MAC;MATERIELS ET APPAREILS DE CONTENTION ET DE MAINTIEN (CHAP. 2);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3514;2;MAD;MATERIELS ET APPAREILS POUR TRAITEMENTS DIVERS (CHAP. 3);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3515;2;PAN;ARTICLES DE PANSEMENTS (CHAP. 4);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3516;2;GLU;NUTRIMENTS POUR INTOLERANTS AU GLUTEN (CHAP. 3);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3517;2;NUT;ALIMENTS DESTINES A DES FINS MEDICALES;1;0;5;0;;1;;1;Z;21;0;12;;;0 +3518;2;ARO;APPAREIL GENERATEUR D AEROSOL;1;0;6;0;;1;;1;Z;21;0;6;;;0 +3521;2;PA;ORTHESES (PETIT APPAREILLAGE) (CHAP. 
1);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3522;2;DVO;DIVERS ORTHESES;1;0;6;0;;1;;1;Z;21;0;6;;;0 +3523;99;OME;FORFAIT MONTURE MOINS DE 18 ANS CMU;1;9;99;0;;1;;1;Z;21;0;6;;;0 +3524;99;OVA;FORFAIT OPTIQUE ENFANT -A- MULTIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3525;99;OVB;FORFAIT OPTIQUE ENFANT -B- MULTIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3526;99;OP7;FORFAIT OPTIQUE -N° 6 MULTIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3527;99;OV1;FORFAIT OPTIQUE -ENFANT-N° 1 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3314;2;MX1/PH1;HORMONE DE CROISSANCE;1;0;5;0;;1;;1;Z;;0;5;;MX1 : fermé le 31/03/2003;0 +3315;2;MX4/PH7;MEDICAMENTS ANTIRETROVIRAUX;1;0;5;0;;1;;1;Z;;0;5;;MX4 : fermé le 31/03/2003;0 +3316;2;MX7/PH7;MEDICAMENTS D EXCEPTION;1;0;5;0;;1;;1;Z;;0;5;;MX7 : fermé le 31/03/2003;0 +3317;2;PHH;PHARMACIE HOSPITALIERE A 100%;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3318;2;PHS;PHARMACIE HOSPITALIERE A 65%;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3319;2;PHQ;PHARMACIE HOSPITALIERE;1;0;5;0;;0;;1;Z;21;0;5;;;0 +3320;2;PH8;PHARMACIE HOSPITALIERE EN SUS DU GHS;1;0;5;0;;0;;0;Z;22;0;0;;;0 +3321;2;PHA;FORFAIT PHARMACEUTIQUE EN MATERNITE;1;0;5;1;21;0;;0;Z;21;0;0;;;0 +3322;2;UPH;MAJORATION POUR ACHAT HORS HEURES OUVRABLES;1;0;5;1;21;0;;1;Z;21;0;5;;;0 +3323;2;CPH;COPIE D ORDONNANCE;1;0;5;1;21;0;;1;Z;21;0;5;;fermé le 30/06/2015;0 +3324;2;EMI;ECART MEDICAMENT INDEMNISABLE;1;0;5;1;21;0;;0;Z;22;0;0;;;0 +3325;2;MPI;MAJORATION PHARMACIE DES ILES;1;0;5;1;21;0;;1;Z;21;0;12;;;0 +3326;2;PMR;PREPARATION MAGISTRALE REMBOURSABLE;1;0;5;0;;1;;1;Z;21;0;5;;;0 +3327;2;PMH;PREPARATION MAGISTRALE HOMEOPATHIQUE;1;0;5;0;;1;;1;Z;21;0;5;;;0 +3328;2;MHU;MEDICAMENTS HOMEOPATHIQUES UNITAIRES;1;0;5;0;;1;;1;Z;21;0;5;;;0 +3329;2;FMV;FORFAIT MEDICAMENT IVG VILLE;1;0;5;0;;1;;1;Z;20;0;18;;;0 +3330;99;ERI;ECART INDEMNISABLE RETROCESSION;1;9;99;0;;0;;0;Z;21;0;0;;;0 +3331;2;PH7/ANTI GRIPPE;VACCIN ANTI-GRIPPE;1;1;;0;;1;;1;Z;;0;5;;;0 +3332;2;PH7/ROR;VACCIN ROR;1;1;;0;;1;;1;Z;;0;5;;;0 +3333;2;PH7/ANTI PALUDEEN;ANTI PALUDEEN;1;1;;0;;0;;1;Z;;0;5;;;0 
+3334;99;GS1;Vaccin anti grippe Hémisphère Sud VAXIGRIP HS;1;9;99;0;;1;;1;Z;21;0;5;;;0 +3335;99;GS2;Vaccin anti grippe Hémisphère Sud FLUARIX HS;1;9;99;0;;1;;1;Z;21;0;5;;;0 +3336;99;PHX;PHARMACIE SOUS ATU SEJOUR;1;9;99;0;;0;;1;Z;22;0;5;;;0 +3337;99;FFC;FORFAIT FAUSSE COUCHE VILLE;1;9;99;0;;1;;1;Z;20;0;46;;;0 +3338;99;FFV;FORFAIT FAUSSE COUCHE VILLE SANS ÉCHOGRAPHIE;1;9;99;0;;1;;1;Z;20;0;46;;;0 +3339;99;FEF;FORFAIT FAUSSE COUCHE ETABLISSEMENT AVEC ECHOGRAPHIE;1;9;99;0;;1;;1;Z;20;0;46;;;0 +3340;99;FFE;FORFAIT FAUSSE COUCHE ETABLISSEMENT SANS ECHOGRAPHIE;1;9;99;0;;1;;1;Z;20;0;46;;;0 +3341;2;PH2;PHARMACIE 15%;1;0;5;0;;1;;1;Z;21;0;28;;;0 +3342;2;PM2;PREPARATION MAGISTRALE ALLOPATHIQUE 15 %;1;0;5;0;;1;;1;Z;21;0;5;;;0 +3343;2;PM4;PREPARATION MAGISTRALE ALLOPATHIQUE;1;0;5;0;;1;;1;Z;21;0;5;;;0 +4112;1;THR;FORFAIT DE SURVEILLANCE MEDICALE REDUIT 2EME HANDICAP;1;0;1;0;;1;;1;C;20;1;1;;fermé le 31/12/1999;0 +4113;1;KTH;PRATIQUES MEDICALES COMPLEMENTAIRES COTEES EN K;1;0;1;0;;1;;1;T;20;1;1;1;;0 +4114;1;CST;COMPLEMENT SURVEILLANCE THERMALE;1;0;1;1;11;0;;1;Z;20;1;1;1;fermé le 01/01/2015;0 +4121;3;FTH;FORFAITS EN ETABLISSEMENT (DATE FIN 12/98);1;0;8;0;;0;;0;Z;;1;0;;fermé le 31/12/1999;0 +4122;3;CTH;SUPPLEMENTS THERMAUX 1ER HANDICAP (DATE FIN 12/98);1;0;8;0;;0;;0;Z;;1;0;;fermé le 31/12/1999;0 +4123;3;THS;SUPPLEMENTS THERMAUX 2EME HANDICAP (DATE FIN 12/98);1;0;8;0;;0;;0;Z;;1;0;;fermé le 31/12/1999;0 +4131;3;TTH;FRAIS DE TRANSPORT - CURES THERMALES;1;1;;0;;0;;0;Z;21;1;0;;;0 +4132;3;HTH;FRAIS D HOTEL - CURES THERMALES;1;0;7;0;;0;;0;Z;21;1;0;;;0 +4141;3;TH1;FORFAIT THERMAL 1;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4142;3;TH2;FORFAIT THERMAL 2 AVEC KINESITEHRAPIE;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4143;3;TH3;FORFAIT THERMAL 2EME ORIENTATION;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4144;3;TH4;FORFAIT THERMAL 3 AVEC 9 SEANCES KINESITHERAPIE;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4145;3;TH5;FORFAIT THERMAL 72 SEANCES AVEC KINE;1;0;8;0;;0;;1;Z;21;0;8;;;0 +4151;3;MK1;FORFAIT THERMAL 18 SEANCES COLLECTIVES;1;0;8;0;;0;;1;Z;21;1;8;;;0 
+4152;3;MK2;FORFAIT THERMAL 18 SEANCES INDIVIDUELLES;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4153;3;MK3;FORFAIT THERMAL 9 SEANCES COLLECTIVES;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4154;3;MK4;FORFAIT THERMAL 9 SEANCES INDIVIDUELLES;1;0;8;0;;0;;1;Z;21;1;8;;;0 +4155;99;FSB;THERMAL-SEVRAGE DES PSYCHOTROPES;1;9;99;0;;0;;1;C;21;1;8;;;0 +4156;99;FKS;THERMAL-SUITE CANCER DU SEIN;1;9;99;0;;0;;1;C;21;1;8;;;0 +4157;99;TC1;THERMAL-TROUBLE COMPORTEMENT 72 SEANCES;1;9;99;0;;0;;1;C;21;1;8;;;0 +4158;99;TC2;THERMAL-TROUBLE COMPORTEMENT 108 SEANCES;1;9;99;0;;0;;1;C;21;1;8;;;0 +4204;99;FUE;FORFAIT TRANSPORT URGENCE EXTRAMUROS CPAM MEUSE;1;9;99;0;;0;;1;Z;21;1;93;;;0 +4205;99;FUI;FORFAIT TRANSPORT URGENCE INTRAMUROS CPAM MEUSE;1;9;99;0;;0;;1;Z;21;1;93;;;0 +4206;99;AFG;PRESTATION FIN DE GARDE AMBULANCE;1;9;99;0;;0;;1;Z;21;1;93;;;0 +4207;99;FTU;FORFAIT TRANSPORT D'URGENCE EXPERIMENTATION CPAM AUDE;1;9;99;0;;0;;1;Z;21;1;93;;;0 +4208;99;FUS;FORFAIT D'URGENCE SUR APPEL DU SAMU EXPERIMENTATION CPAM BOUCHES-DU-RHONE;1;9;99;0;;0;;1;Z;21;1;93;;;0 +4209;99;CTU;COMPLEMENT TRANSPORTS D'URGENCE;1;9;99;0;;0;;1;Z;20;0;50;;;0 +4210;99;TXA;Taxi tarif A;1;9;99;0;;0;;1;Z;21;0;92;;;0 +4211;3;SMU;SERVICES MOBILES D URGENCE ET DE REANIMATION (SMUR);1;0;7;0;;0;;1;Z;21;1;23;;;0 +4212;3;ABA;AMBULANCES AGREEES;1;0;7;0;;0;;1;Z;21;1;90;;;0 +4213;3;VSL;VEHICULES SANITAIRES LEGERS (VSL);1;0;7;0;;0;;1;Z;21;1;91;;;0 +4214;3;TXI;TAXIS;1;0;7;0;;0;;1;Z;21;1;92;;;0 +3528;99;OV2;FORFAIT OPTIQUE -ENFANT-N° 2 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3529;99;OV3;FORFAIT OPTIQUE -ENFANT-N° 3 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3530;99;OV4;FORFAIT OPTIQUE -ENFANT-N° 4 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3531;2;OPT;OPTIQUE MEDICALE PROPREMENT DIT;1;0;6;0;;1;;1;Z;;0;6;;;0 +3532;2;LUN/LNE;MONTURE/LUNETTE POUR ENFANT DE - DE 18 ANS CRPCEN (HORS CODAGE LPP);1;0;6;0;;0;;1;Z;;0;6;;;0 +3533;2;VER/OPE;VERRES / VERRES POUR ENFANT<18 ans -CRPCEN- (hors codage LPP);1;0;6;0;;1;;1;Z;21;0;6;;;0 
+3534;2;OPR/ROM/ROV;REPARATION;1;0;6;0;;0;;1;Z;;0;6;;OPR: fermé le 01/01/2000;0 +3535;2;LEN;LENTILLES;1;0;6;0;;1;;1;Z;21;0;6;;;0 +3536;2;OP1;VERRES UNIFOCAUX OP1 (CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3537;2;OP2;VERRES UNIFOCAUX OP2 (CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3538;2;OP3;VERRES UNIFOCAUX OP3 (CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3539;2;OP4;VERRES UNIFOCAUX OP4 (CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3540;99;PAZ;PROTHESE AUDITIVE RAC ZERO;1;9;99;0;;1;;1;T;21;0;6;;;0 +3541;2;PAU/AUA;APPAREILS ELECTRONIQUES DE SURDITE (CHAP 3.);1;0;6;0;;1;;1;Z;;0;6;;;0 +3542;2;PEX;PROTHESES EXTERNES NON ORTHOPEDIQUES (CHAP. 4);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3543;2;POC;PROTHESES OCULAIRES ET FACIALES (CHAP. 5);1;0;6;0;;0;;1;Z;21;0;6;;;0 +3544;2;COR;CHAUSSURES ORTHOPEDIQUES (CHAP. 6);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3545;2;ORP;ORTHOPROTHESES (CHAP 7.);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3546;2;ORC;ACCESSOIRES DE PROTHESES ET D ORTHOPEDIE (CENTRES D APPAR.) (CHAP.8);1;0;6;0;;0;;1;Z;;0;6;;fermé le 01/11/2000;0 +3547;2;AUP;APPAREILS ELECTRONIQUES DE SURDITE (CONSOMMABLES Y.C. PILES);1;0;6;0;;0;;1;Z;21;0;6;;;0 +3548;2;OPC;ORTHOPROTHESES COUTEUSES;1;0;6;0;;0;;1;Z;21;0;6;;;0 +3549;99;PIO;PROCESSEUR POUR IMPLANT OSTE-INTEGRE;1;9;99;0;;1;;1;Z;21;0;6;;;0 +3550;99;SUI;PROTHESE AUDITIVE SUIVI;1;9;99;0;;1;;1;T;21;0;6;;;0 +3551;2;PII;IMPLANT INTERNE (CHAP. 1, 2 ET 3);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3552;2;PME;IMPLANT MU PAR ELECTRICITE (CHAP. 
4);1;0;6;0;;0;;1;Z;21;0;6;;;0 +3553;99;OV5;FORFAIT OPTIQUE -ENFANT-N° 5 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3554;99;OV6;FORFAIT OPTIQUE -ENFANT-N° 6 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3555;99;OV7;FORFAIT OPTIQUE -ENFANT-N° 7 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3556;99;OV8;FORFAIT OPTIQUE -ENFANT-N° 8 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3557;99;OV9;FORFAIT OPTIQUE -ENFANT-N° 9 UNIFOCAUX (CMU);1;9;99;0;;1;;1;Z;21;0;6;;;0 +3561;2;VEH;VEHICULES POUR HANDICAPES PHYSIQUES;1;0;6;0;;1;;1;Z;21;0;6;;;0 +3571;2;FGA;FRAIS DE GESTION APPAREILLAGE;1;0;6;0;;0;;1;Z;;0;6;;fermé le 30/03/2001;0 +3572;2;ETI;ECART TIPS INDEMNISABLE;1;0;6;0;;0;;1;Z;21;0;6;;;0 +3573;2;FED;FOURNITURE ET EQUIPEMENT DEROGATOIRES;1;0;6;0;;0;;0;Z;21;0;0;;;0 +3574;2;PDM;DISPOSITIF MEDICAL (PRISE EN CHARGE EXCEPTIONNELLE);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3575;2;ATL;COMPLEMENT AT 150% LPP;1;0;99;1;27;0;;1;Z;21;0;6;;;0 +3576;3;PPP;PRESTATION PARTICULIERE ET PANDEMIE;0;0;99;0;;0;;0;Z;21;0;0;;;0 +3581;2;OP5;VERRES UNIFOCAUX OP5(CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3582;2;OP6;VERRES UNIFOCAUX OP6(CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3583;2;OPM;MONTURE (CMU);1;0;6;0;;1;;1;Z;21;0;6;;;0 +3591;99;PCD;PRISE EN CHARGE DEROGATOIRE LPP;1;9;99;0;;0;;1;Z;21;0;6;;;0 +3592;99;DLE;PEC EXCEPTIONNELLE DÉPASSEMENT LPP;1;9;99;0;;0;;1;Z;21;0;6;;;0 +3593;99;TSF;TELESURVEILLANCE : FOURNISSEUR DE LA SOLUTION;1;9;99;0;;0;;1;Z;21;0;1;;;0 +3594;99;DLT;PEC EXCEPTIONNELLE DEPASSEMENT LPP TP;1;9;99;0;;0;;1;Z;21;0;6;;;0 +3610;99;SGN;SUPPLEMENT DEROGATOIRE SG SUR PRESTATION PHARMACIE NON REMBOURSABLE (CNMSS);0;9;99;1;29;0;;1;Z;20;0;5;;;0 +3611;99;SGS;SUPPLEMENT DEROGATOIRE SG SUR PRESTATION SANITAIRE NON REMBOURSABLE (CNMSS);0;9;99;0;;1;;1;Z;20;0;6;;;0 +3612;99;DPS;SUPPLEMENT DEROGATOIRE SG SUR PRESTATION SANITAIRE REMBOURSABLE (CNMSS);0;9;99;1;29;0;;1;Z;20;0;6;;;0 +4111;1;STH;FORFAIT DE SURVEILLANCE MEDICALE 1ER HANDICAP;1;0;1;0;;1;;1;C;20;1;1;1;;0 +4316;3;EXP;EXPERTISE;1;0;16;0;;1;;1;Z;20;0;16;;;0 
+4317;3;HMP;HONORAIRE COMITE REGIONAL RECONNAISSANCE Maladie Professionnelle;0;0;16;0;;0;;1;Z;20;0;47;;;0 +4318;3;DPH;DEPLACEMENT COMITE REGIONAL RECONNAISSANCE MP;0;0;16;1;10;0;;1;Z;20;0;7;;;0 +4319;3;DPE;DEPLACEMENT PERSONNE ENTENDUE (CRRMP);0;0;16;1;10;0;;0;Z;21;0;0;;;0 +4320;3;ECP;AVIS SAPITEUR;0;0;16;0;;0;;1;Z;20;0;47;;;0 +4321;3;FUN;FRAIS FUNERAIRES;1;0;12;0;;0;;0;Z;21;1;0;;;0 +4322;3;TRC;TRANSPORT DU CORPS;1;0;12;0;;0;;0;Z;21;1;0;;;0 +4323;3;PDO;INDEMNITE ALLOUEE EN REPARATION DES PREJUDICES EXTRA-PATRIMOMIAUX;1;0;12;0;;0;;0;Z;21;1;0;;;0 +4324;3;ICE;INDEMNITE DE CHANGEMENT D EMPLOI;1;0;12;0;;0;;0;Z;18;0;0;;;0 +4325;3;ICR;INDEMNITE COMPLEMENTAIRE POUR REEDUCATION PROFESSIONNELLE;1;0;12;0;;0;;0;Z;18;0;0;;;0 +4326;3;PPU;PRETIUM PULCHRITUDINIS;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4327;3;PSE;PRETIUM SEXUALE;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4328;3;PAG;PREJUDICE D'AGREMENT;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4329;3;PPD;PREJUDICE PERTE OU DIMINUTION PROMOTION PROFESSIONNELLE;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4330;99;PNC;PREJUDICES EXTRA PATRIMONIAUX - HORS IV;1;9;99;0;;0;;0;Z;21;0;0;;;0 +4331;3;ETR;REMBOURSEMENTS DE SOINS A L ETRANGER (ET1 A ET6 ET ET8 A ET9, ETB ETA, ETH, ETT, ETP, ETX );1;0;12;0;;0;;0;Z;21;0;0;;;0 +4332;3;PPA;PREJUDICE AMIANTE;0;0;12;0;;0;;0;Z;21;0;0;;;0 +4339;3;RES;REMBOURSEMENT DE SOINS;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4341;3;;FRAIS DE TUTELLE;0;0;12;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4342;3;CRF;COTISATIONS A.T. EN CAS DE READAPTATION FONCTIONNELLE;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4343;3;CRP;COTISATIONS A.T. 
EN CAS DE REEDUCATION PROFESSIONNELLE;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4351;3;PFR;PRIME DE FIN DE REEDUCATION;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4352;3;;AUTRES PRESTATIONS DIVERSES;1;0;12;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +4353;3;IPS;INDEMNITE POUR PERTE DE SALAIRE (MALADIE, AT);1;1;;0;;0;;0;Z;21;0;0;;;0 +4359;99;IUS;FORFAIT UTILISATION DES TELESERVICES;0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4360;99;FPT;FORFAIT PARTICIPATION A LA TELETRANSMISSION;0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4361;99;FFS;FACTURATION FEUILLE DE SOINS (POUR INFORMATION);0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4362;3;AST;ASTREINTE MEDECIN (POUR INFORMATION);0;0;12;0;;0;;1;Z;;0;31;;fermé le 31/12/2002;0 +4363;99;FFN;AIDE A LA FACTURATION FEUILLE DES FLUX NON SECURISES;0;9;99;0;;0;;0;Z;Z;0;0;;;0 +4364;99;AMT;AIDE MAINTENANCE TELETRANSMISSION;0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4365;99;APT;AIDE PORTABLE TELETRANSMISSION;0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4366;99;ADT;AIDE DEMARRAGE TELETRANSMISSION;0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4367;99;ARD;AIDE ADHESION RAPIDE DISPOSITIF;0;9;99;0;;0;;1;Z;Z;0;47;;;0 +4368;99;FCS;FORFAIT STRUCTURE CENTRE DE SANTE;0;9;99;0;;0;;0;Z;Z;0;0;;;0 +4369;99;FPS;FORFAIT PROFESSIONNEL DE SANTE CENTRE DE SANTE;0;9;99;0;;0;;0;Z;Z;0;0;;;0 +4370;99;IFT;FORFAIT D'INCITATION FORFAITAIRE A LA NUMERISATION ET A LA TRANSMISSION;0;9;99;0;;0;;0;Z;;0;0;;;0 +4371;3;FSM;FORFAIT DE SOINS MEDICALISES (REGIME DES MINISTRES DES CULTES ET DES MEMBRES CONGR. ET COLL.RELIGIEUSES);1;0;12;0;;0;;0;Z;21;0;0;;;0 +4372;3;FRC;FORFAIT REPOS CONVALESCENCE (REGIME DES MINISTRES DES CULTES ET DES MEMBRES CONGR. 
ET COLL.RELIGIEUSES);1;0;12;0;;0;;0;Z;21;0;0;;;0 +4373;3;EDS;EXAMEN DE SANTE;1;0;12;0;;0;;0;Z;21;0;0;;;0 +4374;99;MDS;MÉCANISME DE COMPENSATION AUX CENTRES DE SANTÉ;1;9;99;0;;0;;0;Z;20;0;0;0;;0 +4375;3;OMJ;AIDE OUTIL DE MISE A JOUR;0;0;12;0;;0;;0;Z;21;0;0;;;0 +4376;3;LTD;LIGNE TELEPHONIQUE DEDIEE;0;0;12;0;;0;;0;Z;21;0;0;;;0 +4377;3;PSM;PIED SUPPORT MATERIEL;0;0;12;0;;0;;0;Z;21;0;0;;;0 +4378;3;SOD;SUPPLEMENT OFFICINE DOM;0;0;12;0;;0;;0;Z;21;0;0;;;0 +4379;99;ADS;AVANCE RÉMUNÉRATION SPÉCIFIQUE CENTRES DE SANTÉ;1;9;99;0;;0;;0;Z;20;0;0;0;;0 +4380;99;SDS;SOLDE RÉMUNÉRATION SPÉCIFIQUE CENTRES DE SANTÉ;1;9;99;0;;0;;0;Z;20;0;0;0;;0 +4381;3;HN;ACTES NON NOMENCLATURE;0;0;22;0;;1;;1;Z;20;0;60;;;0 +4382;3;PHN;PHARMACIE NON REMBOURSABLE;0;0;22;0;;1;;1;Z;21;0;45;;;0 +4391;3;RCP;RESPONSABILITE CIVILE PROFESSIONNELLE;0;0;12;0;;0;;1;Z;;0;48;;fermé le 06/04/2006;0 +4392;3;RCO;RESPONSABILITE CIVILE ECHOGRAPHIE OBSTETRICALE;0;0;12;0;;0;;1;Z;20;0;48;;;0 +4393;3;RC1;RESPONSABILITE CIVILE CHIRURGIE 1;0;0;12;0;;0;;1;Z;20;0;48;;;0 +4394;3;RC2;RESPONSABILITE CIVILE CHIRURGIE 2;0;0;12;0;;0;;1;Z;20;0;48;;;0 +4395;3;RCA;RESPONSABILITE CIVILE ANESTHESIE REANIMATION;0;0;12;0;;0;;1;Z;20;0;48;;;0 +4396;3;PRS;PRIME RESPONSABILITE SPECIALISTE;0;0;12;0;;0;;1;Z;20;0;48;;;0 +4397;3;ACR;PRIME ACCREDITATION SPECIALISTE;0;0;12;0;;0;;1;Z;20;0;0;;;0 +4411;3;;AIDE SOCIALE;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4412;3;;DISPENSAIRES ANTITUBERCULEUX;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4413;3;;DISPENSAIRES ANTIVENERIENS;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4414;3;;HYGIENE MENTALE;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4415;3;;ETABLISSEMENTS DE LUTTE CONTRE LA TUBERCULOSE;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4416;3;;PROTECTION MATERNELLE ET INFANTILE;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4417;3;;AUTRES PARTICIPATIONS FORFAITAIRES NON INDIVIDUALISEES;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4215;3;VP;VEHICULES 
PERSONNELS;1;0;7;0;;0;;1;Z;21;1;23;;;0 +4216;3;TRP;TRANSPORT REEDUCATION PROFESSIONNEL;1;0;7;0;;0;;0;Z;21;1;0;;;0 +4217;99;TXB;Taxi tarif B;1;9;99;0;;0;;1;Z;21;0;92;;;0 +4218;99;TXC;Taxi tarif C;1;9;99;0;;0;;1;Z;21;0;92;;;0 +4219;3;ATP;AUTRES MODES DE TRANSPORT;1;0;7;0;;0;;1;Z;21;1;23;;;0 +4220;99;TXD;Taxi tarif D;1;9;99;0;;0;;1;Z;21;0;92;;;0 +4221;3;ABG;AMBULANCE AGREEE DE GARDE;1;0;7;0;;0;;1;Z;21;1;93;;;0 +4222;3;ING;INDEMNITE DE GARDE AMBULANCIERE;1;0;7;0;;0;;1;Z;21;1;50;;;0 +4223;3;;PART ASSOCIATION TRANSPORTEUR;1;0;7;0;;0;;0;Z;;1;0;;Saisie manuelle Qualiflux;0 +4224;3;PGE;PRATIQUE DE GEO LOCALISATION PAR DISPOSITIF EMBARQUE;1;0;7;0;;0;;1;Z;20;0;50;;;0 +4225;3;TS2;FORFAIT TRANSPORT PARTAGE PAR 2 PERSONNES;1;0;7;0;;0;;0;Z;21;0;0;;;0 +4226;3;TS3;FORFAIT TRANSPORT PARTAGE PAR 3 PERSONNES;1;0;7;0;;0;;0;Z;21;0;0;;;0 +4227;99;CAQ;CONTRAT D'AMELIORATION DE LA QUALITE ET DE LA COORDINATION DES SOINS;1;9;99;0;;0;;1;Z;20;0;51;;;0 +4228;99;CAC;CONTRAT D'AMELIORATION DE LA QUALITE ET DE LA COORDINATION DES SOINS COMPLEMENT;1;9;99;0;;0;;1;Z;20;0;51;;;0 +4229;99;TXF;Taxi tarif F;1;9;99;0;;0;;1;Z;21;0;92;;;0 +4311;3;DEL;FRAIS DE DEPLACEMENT - ENQU'TE LEGALE AT;0;0;16;1;10;0;;1;Z;;0;7;;fermé le 29/02/2008;0 +4312;3;DCM;FRAIS DE DEPLACEMENT - COLLEGE 3 MEDECINS;0;0;16;1;10;0;;1;Z;20;0;7;;fermé le 01/09/2009;0 +4313;3;HCM;HONORAIRES;0;0;16;0;;0;;1;C;20;0;16;;fermé le 01/09/2009;0 +4314;3;ENQ;ENQUETE;0;0;16;0;;0;;1;Z;;0;16;;fermé le 29/02/2008;0 +4315;3;AUT;AUTOPSIE;0;0;16;0;;0;;0;Z;20;0;0;;;0 +4419;3;;AUTRES DEPENSES NON INDIVIDUALISEES;1;0;11;0;;0;;0;Z;;0;0;;Saisie manuelle Qualiflux;0 +4501;3;CNT;CONTROLES MEDICAUX (CLERCS ET EMPLOYES DE NOTAIRES);0;0;12;0;;0;;0;Z;21;0;0;;;0 +4511;7;CDC;CAPITAL DECES;0;0;14;0;;0;;0;Z;24;0;0;;;0 +4512;99;FCC;FORFAIT CDC COTISANT TI;0;9;99;0;;0;;0;Z;21;0;14;;;0 +4513;99;FRI;FORFAIT CDC RETRAITE TI;0;9;99;0;;0;;0;Z;21;0;14;;;0 +4514;99;FPR;FORFAIT CDC POLY-RETRAITE;0;9;99;0;;0;;0;Z;21;0;14;;;0 +4515;99;AOT;ALLOCATION ORPHELIN D'UN TRAVAILLEUR 
INDEPENDANT DECEDE;0;9;99;0;;0;;0;Z;21;0;14;;;0 +4611;3;FDI;FORFAIT DIVERS PAYES A LA STRUCTURE DE SOINS (FILIERES ET RESEAUX);1;0;11;0;;0;;0;Z;21;0;0;;;0 +4612;3;FET;FORFAIT D EDUCATION THERAPEUTIQUE ET D INTERESSEMENT (FILIERES ET RESEAUX);1;0;11;0;;0;;0;Z;21;0;0;;;0 +5101;3;OP1;FORFAIT VERRES UNIFOCAUX OP1;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5102;3;OP2;FORFAIT VERRES UNIFOCAUX OP2;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5103;3;OP3;FORFAIT VERRES UNIFOCAUX OP3;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5104;3;OP4;FORFAIT VERRES UNIFOCAUX OP4;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5105;3;OP5;FORFAIT VERRES UNIFOCAUX OP5;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5106;3;OP6;FORFAIT VERRES UNIFOCAUX OP6;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5107;3;OPM;FORFAIT MONTURE CMU;0;0;6;0;;0;;1;Z;21;0;40;;;0 +5108;99;OME;TM DU FORFAIT MONTURE MOINS DE 18 ANS CMU;0;9;99;0;;0;;1;Z;;0;40;;;0 +5109;99;OV1;TM DU FORFAIT OPTIQUE -ENFANT-N° 1 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5110;99;OV2;TM DU FORFAIT OPTIQUE -ENFANT-N° 2 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5111;99;OV3;TM DU FORFAIT OPTIQUE -ENFANT-N° 3 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5112;99;OV4;TM DU FORFAIT OPTIQUE -ENFANT-N° 4 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5113;99;OV5;TM DU FORFAIT OPTIQUE -ENFANT-N° 5 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5114;99;OV6;TM DU FORFAIT OPTIQUE -ENFANT-N° 6 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5115;99;OV7;TM DU FORFAIT OPTIQUE -ENFANT-N° 7 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5116;99;OV8;TM DU FORFAIT OPTIQUE -ENFANT-N° 8 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5117;99;OV9;TM DU FORFAIT OPTIQUE -ENFANT-N° 9 UNIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5118;99;OVA;TM DU FORFAIT OPTIQUE ENFANT -A- MULTIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5119;99;OVB;TM DU FORFAIT OPTIQUE ENFANT -B- MULTIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5120;99;OP7;TM DU FORFAIT OPTIQUE -N° 6 MULTIFOCAUX (CMU);0;9;99;0;;0;;1;Z;;0;40;;;0 +5201;3;FDA;FORFAIT PROTHESE DENTAIRE ADJOINTE;0;0;6;0;;1;;1;Z;21;0;27;;;0 
+5202;3;FDR;FORFAIT REPARATION PROTHESE ADJOINTE;0;0;6;0;;1;;1;Z;21;0;27;;;0 +5203;3;FDC;FORFAIT PROTHESE DENTAIRE CONJOINTE;0;0;6;0;;1;;1;Z;21;0;27;;;0 +2205;6;;CCAA : CENTRE DE CURE AMBULATOIRE EN ALCOOLOGIE;0;0;10;0;;0;;0;Z;;0;0;;Pas d information sur la prestation;0 +2206;6;VIH;VIH;0;0;10;0;;0;;0;Z;;0;0;;;0 +2211;5;PJ;FRAIS DE SEJOUR;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2212;5;PMS;MAJORATION PMSI;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2213;5;PJE;FRAIS DE SEJOUR IME;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2214;99;SGJ;SUPPLEMENT DEROGATOIRE SG SUR PRESTATION SEJOUR NON REMBOURSABLE (CNMSS);0;9;10;0;;1;;0;Z;20;0;0;;;0 +2215;99;DSJ;SUPPLEMENT DEROGATOIRE SG SUR PRESTATION SEJOUR REMBOURSABLE (CNMSS);0;9;10;1;29;0;;0;Z;20;0;0;;;0 +2221;5;SHO;SUPPLEMENT CHAMBRE PARTICULIERE;0;0;10;1;13;0;;0;Z;22;0;0;;;0 +2222;5;SSM;SUPPLEMENT POUR SURVEILLANCE DU MALADE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2223;5;HNN;FRAIS D HOSPITALISATION DU NOUVEAU-NE DONNANT LIEU A FACTURATION EN SUPPLEMENT DE L HOSPITALISATION DE LA MERE EN MAISON DE REPOS;0;0;10;0;;0;;0;Z;;0;0;;fermé le 30/04/2003;0 +2224;5;SCH;SUPPLEMENT POUR CHAMBRE CHAUDE;0;0;10;1;13;0;;0;Z;;0;0;;fermé le 30/04/2003;0 +2225;5;SIN;SUPPLEMENT POUR INCUBATEUR;0;0;10;1;13;0;;0;Z;;0;0;;fermé le 30/04/2003;0 +2123;5;D03;ENTRAINEMENT A L HEMODIALYSE A DOMICILE ET A L AUTODIALYSE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2124;5;D04;ENTRAINEMENT A LA DPA;0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2125;5;D05;ENTRAINEMENT A LA DPCA;0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2126;5;D06;HEMODIALYSE A DOMICILE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2127;5;D07;DIALYSE PERITONEALE AUTOMATISEE (DPA);0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2128;5;D08;DIALYSE PERITONEALE CONTINUE AMBULATOIRE (DPCA);0;0;10;0;;0;;0;Z;;0;0;;fermé le 29/02/2008;0 +2129;5;D09;FORFAIT D'HEMODIALYSE EN CENTRE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 01/03/2013;0 +2131;5;D10;FORFAIT D HEMODIALYSE EN CENTRE POUR ENFANT;0;0;10;0;;0;;0;Z;;0;0;;fermé le 01/03/2013;0 +2132;5;D11;FORFAIT D 
HEMODIALYSE EN UNITE DE DIALYSE MEDICALISEE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2133;5;DTP;DIALYSE TIERCE PERSONNE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2134;5;D12;FORFAIT D AUTODIALYSE SIMPLE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2135;5;D13;FORFAIT D AUTODIALYSE ASSISTEE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2136;5;D14;FORFAIT D HEMODIALYSE A DOMICILE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +2137;5;D15;FORFAIT DE DIALYSE PERITONEALE AUTOMATISEE (DPA);0;0;10;0;;0;;0;Z;22;0;0;;;0 +2138;5;D16;FORFAIT DE DIALYSE PERITONEALE CONTINUE AMBULATOIRE (DPCA);0;0;10;0;;0;;0;Z;22;0;0;;;0 +2139;5;D17;FORFAIT D ENTRAINEMENT A L HEMODIALYSE A DOMICILE ET AUTODIALYSE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 01/03/2013;0 +2140;5;D18;FORFAIT D ENTRAINEMENT A LA DIALYSE PERITONEALE AUTOMATISEE;0;0;10;0;;0;;0;Z;;0;0;;fermé le 01/03/2013;0 +2141;0;CPC;FRAIS DE CHAMBRE PARTICULIERE POUR CONVENANCE PERSONNELLE;0;0;10;0;;0;;0;Z;22;0;0;;;0 +1617;1;RCC;REDEVANCE CHEF DE CLINIQUE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1618;1;CRC;COMPLEMENT DE REMUNERATION CHEF DE CLINIQUE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1619;1;CSS;CONTRAT SANTE SOLIDARITE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1621;99;CAT;CONTRAT D'AMELIORATION DE L'ORGANISATION DES SOINS - TRANSPORTS;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1622;99;CAP;CONTRAT D'AMELIORATION DE L'ORGANISATION DES SOINS - PHARMACIE/LPP;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1623;99;RCT;REVERSEMENTS CONTRAT D'AMELIORATION DE L'ORGANISATION DES SOINS TRANSPORTS;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1624;99;RCL;REVERSEMENTS CONTRAT D'AMELIORATION DE L'ORGANISATION DES SOINS - PHARMACIE / LPP;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1625;99;PCT;PENALITES CONTRAT D'AMELIORATION DE L'ORGANISATION DES SOINS - TRANSPORT;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1626;99;PCL;PENALITES CONTRAT D'AMELIORATION DE L'ORGANISATION DES SOINS - PHARMACIE / LPP;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1627;99;CIC;CONTRAT INCITATIF CHIRURGIEN-DENTISTE;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1628;99;ODG;DEMO. AIDE FORFAITAIRE GROUPE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1629;99;ODP;DEMO. 
AIDE FORFAITAIRE POLE SANTE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1630;99;DAG;DEMO. ACTIVITE GROUPE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1631;99;DOP;DEMO. ACTIVITE POLE SANTE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1632;99;STA;SST AIDE ACTIVITE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1633;99;STD;SST FRAIS DEPLACEMENT;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1634;99;CDG;CONTESTATION POUR AIDE FORFAITAIRE POUR LES ADHERENTS A L'OPTION DEMOGRAPHIE DANS UN GROUPE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1635;99;CDP;CONTESTATION POUR AIDE FORFAITAIRE POUR LES ADHERENTS A L'OPTION DEMOGRAPHIE DANS UN POLE DE SANTE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1636;99;CAG;CONTESTATION POUR AIDE A L'ACTIVITE POUR LES ADHERENTS A L'OPTION DEMOGRAPHIE DANS UN GROUPE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1637;99;CAS;CONTESTATION POUR AIDE A L'ACTIVITE POUR LES ADHERENTS A L'OPTION DEMOGRAPHIE DANS UN POLE DE SANTE;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1638;99;CTA;CONTESTATION POUR AIDE A L'ACTIVITE POUR LES ADHERENTS A L'OPTION SST;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1639;99;CTD;CONTESTATION POUR FRAIS DE DEPLACEMENT POUR LES ADHERENTS A L'OPTION SST;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1640;99;CIM;FORFAIT AIDE A L'INSTALLATION DU MEDECIN - CAIM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1641;99;MAI;MAJORATION AIDE À L'INSTALLATION DU MEDECIN - CAIM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1642;99;EHP;MAJORATION POUR EXERCICE PARTIEL EN HÔPITAL DE PROXIMITÉ - CAIM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1643;99;PAA;PAIEMENT AIDE ACTIVITE - CSTM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1644;99;MAM;MAJO REMUNERATION ARS - CSTM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1645;99;DEP;PAIEMENT PRISE EN CHARGE FRAIS DEPLACEMENT - CSTM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1646;99;AIF;AIDE FORFAITAIRE - COSCOM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1647;99;RMS;REMUNERATION COMPLEMENTAIRE ACCUEIL DE STAGIAIRE - COSCOM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1648;99;RHP;REMUNERATION COMPLEMENTAIRE EXERCICE EN HOPITAL - COSCOM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1649;99;MAO;PAIEMENT MAJORATION DE LA REMUNERATION ARS - COSCOM;1;9;99;0;;0;;1;Z;21;0;47;;;0 +1431;1;D/OCC;ACTES EN 
D (ET OCC POUR LA CRPCEN);1;0;2;0;;1;1;1;Z;;0;2;;fermé le 01/12/2014;0 +1432;1;DC;ACTES EN DC;1;0;2;0;;1;1;1;Z;20;0;37;;;0 +1433;1;SC/SCA;ACTES EN SC (ET SCA POUR LA CRPCEN);1;0;2;0;;1;1;1;Z;;0;34;;SC: fermé le 01/12/2014;0 +1434;99;BDC;PREVENTION BUCCO-DENTAIRE: CONSULTATION - MATER;1;9;99;0;;1;;1;Z;20;0;20;;;0 +1435;99;BR2;PREVENTION BUCCO-DENTAIRE: RADIO DEUX CLICHES-MATER;1;9;99;0;;1;;1;Z;20;0;20;;;0 +1436;99;BR4;PREVENTION BUCCO-DENTAIRE: RADIO QUATRE CLICHES-MATER;1;9;99;0;;1;;1;Z;20;0;20;;;0 +1437;99;MCD;Majoration spécifique PDS Clinique Dentiste;1;9;99;1;23;0;;1;Z;20;0;2;;;0 +1451;99;SDE;SOINS DENTAIRES;1;9;99;0;;1;;1;T;20;1;41;1;;0 +1452;99;PAR;PROTHESE AMOVIBLE DEFINITIVE RESINE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1453;99;AXI;PROPHYLAXIE BUCCO DENTAIRE CCAM;1;9;99;0;;1;;1;T;20;1;41;1;;0 +1461;1;ADP;ACTES DIVERS PROTHESE DENTAIRE CCAM;1;0;2;0;;0;;0;Z;;1;0;;;0 +1462;99;PFM;PROTHESE FIXE METALLIQUE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1463;1;PFE;PROTHESE DENTAIRE FIXE ESTHETIQUE CCAM;1;0;2;0;;0;;0;Z;;1;0;;;0 +1464;1;PDA;PROTHESE DENTAIRE AMOVIBLE CCAM;1;0;2;0;;0;;0;Z;;1;0;;;0 +1465;99;IMP;IMPLANTOLOGIE - CCAM;1;9;99;0;;1;;1;T;20;1;25;1;;0 +1466;1;TOR;TRAITEMENT ORTHOPEDIE DENTO FACIALE CCAM;1;0;2;0;;0;;0;Z;;1;0;;;0 +1470;99;END;ENDODONTIE;1;9;99;0;;1;;1;T;20;1;41;1;;0 +1471;99;INO;INLAY-ONLAY;1;9;99;0;;1;;1;T;20;1;41;1;;0 +1472;99;TDS;PARODONTOLOGIE;1;9;99;0;;1;;1;T;20;1;25;1;;0 +1473;99;ICO;INLAY-CORE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1474;99;PAM;PROTHESE AMOVIBLE DEFINITIVE METALLIQUE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1475;99;PDT;PROTHESE DENTAIRE PROVISOIRE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1476;99;PFC;PROTHESE FIXE CERAMIQUE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1477;99;RPN;REPARATION SUR PROTHESE;1;9;99;0;;1;;1;T;20;1;42;;;0 +1511;1;A;FORFAIT D ACCOUCHEMENT SIMPLE DES SAGES-FEMMES (FORFAIT N91);1;0;1;0;;0;;1;Z;;0;1;;fermé le 21/11/2004;0 +1512;1;AM;FORFAIT D ACCOUCHEMENT MULTIPLE DES SAGES-FEMMES (FORFAIT N92);1;0;1;0;;0;;1;Z;;0;1;;;0 +1521;1;MG;MAJORATION POUR 
GARDE;1;0;1;1;7;0;;1;Z;20;0;1;;;0 +1522;1;MA;MAJORATION ASTREINTE;1;0;1;1;7;0;;1;Z;20;0;1;;;0 +1523;99;PRC;Permanence Rémunération demi-journée Chirurgien-dentiste;1;9;99;0;;0;;1;Z;20;0;47;;;0 +1601;1;CPU;CONTRAT PRATIQUE VERSEMENT UNIQUEMENT;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1602;1;CBP;FORFAIT CONTRAT DE BONNES PRATIQUES;1;2;;0;;0;;1;Z;20;0;47;;;0 +1603;1;CP1;FORFAIT CONTRAT DE BONNES PRATIQUES CP1;1;0;7;0;;0;;1;Z;20;0;50;;fermé le 31/12/2007;0 +1604;1;CP2;FORFAIT CONTRAT DE BONNES PRATIQUES CP2;1;0;7;0;;0;;1;Z;20;0;50;;fermé le 31/12/2008;0 +1605;1;CP3;FORFAIT CONTRAT DE BONNES PRATIQUES CP3;1;0;7;0;;0;;1;Z;20;0;50;;;0 +1606;1;CPL;FORFAIT CONTRAT DE BONNES PRATIQUES;1;0;4;0;;0;;1;Z;20;0;47;;;0 +1607;99;CSI;FORFAIT CONTRAT DE SANTE PUBLIQUE INFIRMIER;1;9;99;0;;0;;1;Z;20;0;46;;fermé le01/01/2010;0 +1608;1;CSL;FORFAIT CONTRAT DE SANTE PUBLIQUE BIOLOGISTES;1;0;4;0;;0;;1;Z;20;0;46;;;0 +1609;1;CBR;CONTRAT DE BONNES PRATIQUES ZONE RURAL;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1610;99;CAD;RÉMUNÉRATION CONTRAT D'ACCÈS AUX SOINS DENTAIRES;1;9;99;0;;0;;0;Z;20;0;0;;;0 +1611;1;CBM;CONTRAT DE BONNES PRATIQUES ZONE MONTAGNE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1612;1;CBU;CONTRAT DE BONNES PRATIQUES ZONE URBAINE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1613;1;ZFU;CONTRAT DE BONNES PRATIQUES ZONE FRANCHE URBAINE;1;0;1;0;;0;;1;Z;20;0;47;;;0 +1614;1;CP6;CONTRAT DE BONNES PRATIQUES TRANSPORTEURS 2006;1;0;1;0;;0;;1;Z;20;0;50;;;0 +1615;1;CP7;CONTRAT DE BONNES PRATIQUES TRANSPORTEURS 2007;1;0;1;0;;0;;1;Z;20;0;50;;;0 +1616;1;CP8;CONTRAT DE BONNES PRATIQUES TRANSPORTEURS 2008;1;0;1;0;;0;;1;Z;20;0;50;;;0 diff --git a/src/test/resources/value_tables/IR_NAT_V.parquet b/src/test/resources/value_tables/IR_NAT_V.parquet new file mode 100644 index 00000000..03d41591 Binary files /dev/null and b/src/test/resources/value_tables/IR_NAT_V.parquet differ diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ClassificationSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ClassificationSuite.scala index 
e716c9db..0bc1b675 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ClassificationSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ClassificationSuite.scala
@@ -24,28 +24,4 @@ class ClassificationSuite extends AnyFlatSpecLike {
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "convert the row accordingly" in {
-    // Given
-    val schema = StructType(
-      StructField("patientID", StringType) ::
-      StructField("groupID", StringType) ::
-      StructField("name", StringType) ::
-      StructField("eventDate", TimestampType) :: Nil
-    )
-
-    val values = Array[Any]("Stevie", "42", "GHMDA233", makeTS(2016, 1, 1))
-
-    val row = new GenericRowWithSchema(values, schema)
-
-    val expected = Event("Stevie", "ghm", "42", "GHMDA233", 0.0, makeTS(2016, 1, 1), None)
-
-    // When
-    val result = GHMClassification.fromRow(row)
-
-    // Then
-    assert(result == expected)
-  }
-
-
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DiagnosisSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DiagnosisSuite.scala
index 166c6ebf..1f0fd50d 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DiagnosisSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DiagnosisSuite.scala
@@ -29,44 +29,4 @@ class DiagnosisSuite extends AnyFlatSpec {
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "allow creation of a DiagnosisBuilder event from a row object" in {
-
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("gId", StringType) ::
-      StructField("cod", StringType) ::
-      StructField("wei", StringType) ::
-      StructField("dat", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "1_1_2010", "C67", 1.0, makeTS(2010, 1, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = MockDiagnosis("Patient01", "1_1_2010", "C67", 1.0, makeTS(2010, 1, 1))
-
-    // When
-    val result = MockDiagnosis.fromRow(r, "pID", "gId", "cod", "wei", "dat")
-
-    // Then
-    assert(result == expected)
-  }
-
-  it should "support creation without groupId" in {
-
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("cod", StringType) ::
-      StructField("dat", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "C67", makeTS(2010, 1, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = MockDiagnosis("Patient01", "C67", makeTS(2010, 1, 1))
-
-    // When
-    val result = MockDiagnosis.fromRow(r, "pID", "cod", "dat")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DrugSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DrugSuite.scala
index d1bfe88c..02a8e545 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DrugSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/DrugSuite.scala
@@ -19,30 +19,11 @@ class DrugSuite extends SharedContext {
     )

     // When
-    val result = Drug(patientID, "Drug1", 0.1, timestamp)
+    val result = Drug(patientID, "Drug1", 0.1, "NA", timestamp)

     // Then
     assert(result == expected)
   }

-  "fromRow" should "create extract corresponding column values and create Drug event correctly" in {
-
-    // Given
-    val sqlCtx = sqlContext
-    import sqlCtx.implicits._
-    import util.functions.makeTS
-
-    val inputDF = Seq(
-      ("patientId", "drugName", 0.1, makeTS(2014, 5, 5))
-    ).toDF("pId", "dname", "weigh", "eventDate")
-
-    val expected = Drug("patientId", "drugName", 0.1, makeTS(2014, 5, 5))
-
-    // When
-    val result = Drug.fromRow(inputDF.first, "pId", "dname", "weigh")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ExposureSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ExposureSuite.scala
index ddaec829..ba14ef27 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ExposureSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/ExposureSuite.scala
@@ -23,24 +23,4 @@ class ExposureSuite extends AnyFlatSpec {
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "allow creation of a Exposure event from a row object" in {
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("mol", StringType) ::
-      StructField("weight", DoubleType) ::
-      StructField("start", TimestampType) ::
-      StructField("end", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "pioglitazone", 100.0, makeTS(2010, 1, 1), makeTS(2010, 2, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = Exposure("Patient01", "pioglitazone", 100.0, makeTS(2010, 1, 1), makeTS(2010, 2, 1))
-
-    // When
-    val result = Exposure.fromRow(r, "pID", "mol", "weight", "start", "end")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUpSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUpSuite.scala
index a375f656..4538461d 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUpSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/FollowUpSuite.scala
@@ -23,23 +23,4 @@ class FollowUpSuite extends AnyFlatSpec {
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "allow creation of a FollowUp event from a row object" in {
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("endR", StringType) ::
-      StructField("start", TimestampType) ::
-      StructField("end", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "any_reason", makeTS(2010, 1, 1), makeTS(2010, 2, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = FollowUp("Patient01", "any_reason", makeTS(2010, 1, 1), makeTS(2010, 2, 1))
-
-    // When
-    val result = FollowUp.fromRow(r, "pID", "endR", "start", "end")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStaySuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStaySuite.scala
index 036747e5..05976c5b 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStaySuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/HospitalStaySuite.scala
@@ -23,27 +23,4 @@ class HospitalStaySuite extends SharedContext {

     assert(expected == result)
   }
-
-  "fromRow" should "create hospital stay event correctly from dataframe row" in {
-    // Given
-    val sqlCtx = sqlContext
-    import sqlCtx.implicits._
-
-    val df = Seq(
-      ("patientID", "hospitalID", makeTS(2018, 1, 1), makeTS(2018, 3, 1))
-    ).toDF("patientID", "value", "start", "end")
-
-    val expected = HospitalStay(
-      "patientID", "hospitalID",
-      makeTS(2018, 1, 1), makeTS(2018, 3, 1)
-    )
-
-    //When
-    val result = HospitalStay.fromRow(df.first)
-
-    //Then
-    assert(expected == result)
-
-  }
-
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalActSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalActSuite.scala
index 849023fe..fdbf650b 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalActSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalActSuite.scala
@@ -18,7 +18,7 @@ class MedicalActSuite extends AnyFlatSpec {
     val category: EventCategory[MedicalAct] = "mock_act"
   }

-  "apply" should "allow creation of a DiagnosisBuilder event" in {
+  "apply" should "allow creation of a Medical Act event" in {

     // Given
     val expected = Event[MedicalAct](patientID, MockMedicalAct.category, "hosp", "C67", 0.0, timestamp, None)
@@ -29,25 +29,4 @@
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "allow creation of a DiagnosisBuilder event from a row object" in {
-
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("gId", StringType) ::
-      StructField("cod", StringType) ::
-      StructField("wei", StringType) ::
-      StructField("dat", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "1_1_2010", "C67", 0.0, makeTS(2010, 1, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = MockMedicalAct("Patient01", "1_1_2010", "C67", 0.0, makeTS(2010, 1, 1))
-
-    // When
-    val result = MockMedicalAct.fromRow(r, "pID", "gId", "cod", "wei", "dat")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReasonSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReasonSuite.scala
index 3705a58a..e15b5a7a 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReasonSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MedicalTakeOverReasonSuite.scala
@@ -28,25 +28,4 @@ class MedicalTakeOverReasonSuite extends AnyFlatSpec {
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "allow creation of a MedicalTakeOverReason Builder event from a row object" in {
-
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("gId", StringType) ::
-      StructField("cod", StringType) ::
-      StructField("wei", StringType) ::
-      StructField("dat", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "1_1_2010", "11", 0.0, makeTS(2010, 1, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = MockMedicalTakeOverReason("Patient01", "1_1_2010", "11", 0.0, makeTS(2010, 1, 1))
-
-    // When
-    val result = MockMedicalTakeOverReason.fromRow(r, "pID", "gId", "cod", "wei", "dat")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MoleculeSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MoleculeSuite.scala
index 2c5ba020..be5b18bd 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MoleculeSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/MoleculeSuite.scala
@@ -24,23 +24,4 @@ class MoleculeSuite extends AnyFlatSpec {
     // Then
     assert(result == expected)
   }
-
-  "fromRow" should "allow creation of a Molecule event from a row object" in {
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("mol", StringType) ::
-      StructField("weight", DoubleType) ::
-      StructField("date", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "pioglitazone", 100.0, makeTS(2010, 1, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = Molecule("Patient01", "pioglitazone", 100.0, makeTS(2010, 1, 1))
-
-    // When
-    val result = Molecule.fromRow(r, "pID", "mol", "weight", "date")
-
-    // Then
-    assert(result == expected)
-  }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/NgapActSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/NgapActSuite.scala
new file mode 100644
index 00000000..5ccf19f2
--- /dev/null
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/NgapActSuite.scala
@@ -0,0 +1,27 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.events
+
+import java.sql.Timestamp
+import org.mockito.Mockito.mock
+import org.scalatest.flatspec.AnyFlatSpec
+
+class NgapActSuite extends AnyFlatSpec {
+
+  object MockNgapAct extends NgapAct
+
+  val patientID: String = "patientID"
+  val timestamp: Timestamp = mock(classOf[Timestamp])
+
+  "apply" should "allow creation of a NgapActBuilder event" in {
+
+    // Given
+    val expected = Event[NgapAct](patientID, MockNgapAct.category, "A10000001", "9.5", 0.0, timestamp, None)
+
+    // When
+    val result = MockNgapAct(patientID, "A10000001", "9.5", timestamp)
+
+    // Then
+    assert(result == expected)
+  }
+}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpecialitySuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpecialitySuite.scala
index bf2ccf5b..5f45e625 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpecialitySuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/events/PractionnerClaimSpecialitySuite.scala
@@ -14,7 +14,7 @@ class PractitionerClaimSpecialitySuite extends AnyFlatSpec {
   val patientID: String = "patientID"
   val timestamp: Timestamp = mock(classOf[Timestamp])

-  object MockPractionnerClaimSpeciality$ extends PractitionerClaimSpeciality {
+  object MockPractionnerClaimSpeciality extends PractitionerClaimSpeciality {
     val category: EventCategory[PractitionerClaimSpeciality] = "mock_prestationSpeciality"
   }
@@ -23,7 +23,7 @@ class PractitionerClaimSpecialitySuite extends AnyFlatSpec {
     // Given
     val expected = Event[PractitionerClaimSpeciality](
       patientID,
-      MockPractionnerClaimSpeciality$.category,
+      MockPractionnerClaimSpeciality.category,
       "A10000001",
       "42",
       0.0,
@@ -32,27 +32,7 @@
     )

     // When
-    val result = MockPractionnerClaimSpeciality$(patientID, "A10000001", "42", timestamp)
-
-    // Then
-    assert(result == expected)
-  }
-
-  "fromRow" should "allow creation of a PrestationSpecialityBuilder event from a row object" in {
-
-    // Given
-    val schema = StructType(
-      StructField("pID", StringType) ::
-      StructField("gId", StringType) ::
-      StructField("cod", StringType) ::
-      StructField("dat", TimestampType) :: Nil
-    )
-    val values = Array[Any]("Patient01", "A10000001", "42", makeTS(2010, 1, 1))
-    val r = new GenericRowWithSchema(values, schema)
-    val expected = MockPractionnerClaimSpeciality$("Patient01", "A10000001", "42", makeTS(2010, 1, 1))
-
-    // When
-    val result = MockPractionnerClaimSpeciality$.fromRow(r, "pID", "gId", "cod", "dat")
+    val result = MockPractionnerClaimSpeciality(patientID, "A10000001", "42", timestamp)

     // Then
     assert(result == expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractorSuite.scala
index 5471a21e..9ffe7fb8 100644
---
a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractorSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/EventRowExtractorSuite.scala @@ -14,7 +14,7 @@ class EventRowExtractorSuite extends SharedContext { def extractStart(r: Row) = new Timestamp(0) } - "extractGroupId" should "return the group ID for imb (always 'imb')" in { + "extractGroupId" should "return NA" in { // Given val expected = "NA" @@ -26,8 +26,20 @@ class EventRowExtractorSuite extends SharedContext { assert(result == expected) } + "extractValue" should "return NA" in { - "weight" should "return the weight value" in { + // Given + val expected = "NA" + + // When + val result = MockRowExtractor.extractValue(Row()) + + // Then + assert(result == expected) + } + + + "weight" should "return 0.0" in { // Given val expected = 0.0 @@ -40,7 +52,7 @@ } - "end" should "compute the end date of the event" in { + "end" should "return None" in { // Given val expected: Option[Timestamp] = None diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/PrescriptionExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/PrescriptionExtractorSuite.scala deleted file mode 100644 index 50af9060..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/PrescriptionExtractorSuite.scala +++ /dev/null @@ -1,51 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors - -import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema -import org.apache.spark.sql.types.{StringType, StructField, StructType} -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.{AnyEvent, EventBuilder, EventCategory} -import fr.polytechnique.cmap.cnam.etl.extractors.dcir.DcirExtractor - - -trait MockEvent extends AnyEvent with EventBuilder - -object MockEventobject extends MockEvent { - override val category:
EventCategory[AnyEvent] = "NA" -} - -class PrescriptionExtractorSuite extends SharedContext { - - trait DcirMockExtractor extends DcirExtractor[MockEvent] - - object MockPrescriptionExtractor extends DcirMockExtractor { - override val columnName: String = "" - override val eventBuilder: EventBuilder = MockEventobject - } - - "extractGroupId" should "return the group ID for done values" in { - // Given - val schema = StructType( - Seq( - StructField("FLX_DIS_DTD", StringType), - StructField("FLX_TRT_DTD", StringType), - StructField("FLX_EMT_TYP", StringType), - StructField("FLX_EMT_NUM", StringType), - StructField("FLX_EMT_ORD", StringType), - StructField("ORG_CLE_NUM", StringType), - StructField("DCT_ORD_NUM", StringType) - ) - ) - - val values = Array[Any]("2014-08-01", "2014-07-17", "1", "17", "0", "01C673000", "1749") - val r = new GenericRowWithSchema(values, schema) - val expected = "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMwMDBfMTc0OQ==" - - // When - val result = MockPrescriptionExtractor.extractGroupId(r) - - // Then - assert(result == expected) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirBiologyActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirBiologyActsSuite.scala deleted file mode 100644 index 1ba30c1b..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirBiologyActsSuite.scala +++ /dev/null @@ -1,271 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.acts - -import scala.util.Success - -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.{BiologyDcirAct, DcirAct, Event, MedicalAct} -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions._ -import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema -import org.apache.spark.sql.types._ -import org.scalatest.matchers.should.Matchers.{an, convertToAnyShouldWrapper} -import 
org.scalatest.TryValues._ - - -class DcirBiologyActsSuite extends SharedContext { - - import DcirBiologyActExtractor.ColNames - - val schema = StructType( - StructField(ColNames.PatientID, StringType) :: - StructField(ColNames.BioCode, StringType) :: - StructField(ColNames.InstitutionCode, DoubleType) :: - StructField(ColNames.GHSCode, DoubleType) :: - StructField(ColNames.Sector, DoubleType) :: - StructField(ColNames.Date, DateType) :: Nil - ) - - val oldSchema = StructType( - StructField(ColNames.PatientID, StringType) :: - StructField(ColNames.BioCode, StringType) :: - StructField(ColNames.Date, DateType) :: Nil - ) - - "isInStudy" should "return true when a study code is found in the row" in { - - // Given - val codes = Set("AAAA", "BBBB") - val inputArray = Array[Any]("Patient_A", "AAAA", null, null, null, makeTS(2010, 1, 1)) - val inputRow = new GenericRowWithSchema(inputArray, schema) - - // When - val result = DcirBiologyActExtractor.isInStudy(codes)(inputRow) - - // Then - assert(result) - } - - it should "return false when no code is found in the row" in { - - // Given - val codes = Set("AAAA", "BBBB") - val inputArray = Array[Any]("Patient_A", "CCCC", 1D, 0D, 1D, makeTS(2010, 1, 1)) - val inputRow = new GenericRowWithSchema(inputArray, schema) - - // When - val result = DcirBiologyActExtractor.isInStudy(codes)(inputRow) - - // Then - assert(!result) - } - - "builder" should "return a DCIR act if the event is in a older version of DCIR" in { - // Given - val inputArray = Array[Any]("Patient_A", "AAAA", makeTS(2010, 1, 1)) - val inputRow = new GenericRowWithSchema(inputArray, oldSchema) - val expected = Seq(BiologyDcirAct("Patient_A", BiologyDcirAct.groupID.DcirAct, "AAAA", 1.0, makeTS(2010, 1, 1))) - - // When - val result = DcirBiologyActExtractor.builder(inputRow) - - // Then - assert(result == expected) - } - - "getGHS" should "return the value in the correct column" in { - // Given - val schema = StructType(StructField(ColNames.GHSCode, DoubleType) :: 
Nil) - val inputArray = Array[Any](3D) - val input = new GenericRowWithSchema(inputArray, schema) - val expected = 3D - - // When - val result = DcirBiologyActExtractor.getGHS(input) - - // Then - assert(result == expected) - } - - "getSector" should "return the expected value" in { - // Given - val schema = StructType(StructField(ColNames.Sector, DoubleType) :: Nil) - val inputArray = Array[Any](3D) - val input = new GenericRowWithSchema(inputArray, schema) - val expected = 3D - - // When - val result = DcirBiologyActExtractor.getSector(input) - - // Then - assert(result == expected) - } - - "getInstitutionCode" should "return the value in the correct column" in { - // Given - val schema = StructType(StructField(ColNames.InstitutionCode, DoubleType) :: Nil) - val inputArray = Array[Any](52D) - val input = new GenericRowWithSchema(inputArray, schema) - val expected = 52D - - // When - val result = DcirBiologyActExtractor.getInstitutionCode(input) - - // Then - assert(result == expected) - - } - - "getGroupID" should "return correct status of private ambulatory" in { - // Given - val schema = StructType( - StructField(ColNames.GHSCode, DoubleType) :: - StructField(ColNames.Sector, StringType) :: - StructField(ColNames.InstitutionCode, DoubleType) :: Nil - ) - val array = Array[Any](0D, 2D, 6D) - val input = new GenericRowWithSchema(array, schema) - val expected = Success(DcirAct.groupID.PrivateAmbulatory) - - // When - val result = DcirBiologyActExtractor.getGroupId(input) - - // Then - assert(result == expected) - - } - - it should "return Success(PublicAmbulatory) if it is public related" in { - // Given - val schema = StructType(StructField(ColNames.Sector, StringType) :: Nil) - val array = Array[Any](1D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirBiologyActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.PublicAmbulatory - } - - it should "return Success(Liberal) if it is liberal act" 
in { - // Given - val schema = StructType( - StructField(ColNames.Sector, StringType) :: StructField( - ColNames.GHSCode, - StringType - ) :: Nil - ) - val array = Array[Any](null, null) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirBiologyActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.Liberal - } - - it should "return Success(PrivateAmbulatory) if it is private ambulatory act" in { - // Given - val schema = StructType( - StructField(ColNames.Sector, StringType) :: StructField( - ColNames.GHSCode, - DoubleType - ) :: StructField(ColNames.InstitutionCode, DoubleType) :: Nil - ) - val array = Array[Any](null, 0D, 4D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirBiologyActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.PrivateAmbulatory - } - - it should "return Success(UnkownSource) if it is an act with unknown source" in { - // Given - val schema = StructType( - StructField(ColNames.Sector, StringType) :: StructField( - ColNames.GHSCode, - DoubleType - ) :: StructField(ColNames.InstitutionCode, DoubleType) :: Nil - ) - val array = Array[Any](null, 1D, 4D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirBiologyActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.Unknown - } - - it should "return IllegalArgumentException if the information of source of act is unavailable in DCIR" in { - // Given - val schema = StructType( - StructField(ColNames.GHSCode, DoubleType) :: - StructField(ColNames.Sector, StringType) :: Nil - ) - val array = Array[Any](0D, 2D, 6D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirBiologyActExtractor.getGroupId(input) - - // Then - result.failure.exception shouldBe an[IllegalArgumentException] - } - - "extract" should "return a Dataset of DCIR Biology Acts" in { - - val sqlCtx = 
sqlContext - import sqlCtx.implicits._ - - // Given - val codes = Set("AAAA", "CCCC") - - val input = Seq( - ("Patient_A", "AAAA", "CCAM1", Some(makeTS(2010, 1, 1)), None, None, None, makeTS(2010, 1, 1)), - ("Patient_A", "BBBB", "CCAM1", Some(makeTS(2010, 2, 1)), Some(1D), Some(0D), Some(1D), makeTS(2010, 2, 1)), - ("Patient_B", "CCCC", "CCAM1", Some(makeTS(2010, 3, 1)), None, None, None, makeTS(2010, 3, 1)), - ("Patient_B", "CCCC", "CCAM1", Some(makeTS(2010, 4, 1)), Some(7D), Some(0D), Some(2D), makeTS(2010, 4, 1)), - ("Patient_C", "BBBB", "CCAM1", None, Some(1D), Some(0D), Some(2D), makeTS(2010, 5, 1)) - ).toDF( - ColNames.PatientID, ColNames.BioCode, ColNames.CamCode, ColNames.Date, - ColNames.InstitutionCode, ColNames.GHSCode, ColNames.Sector, ColNames.DcirFluxDate - ) - - val sources = Sources(dcir = Some(input)) - - val expected = Seq[Event[MedicalAct]]( - BiologyDcirAct("Patient_A", BiologyDcirAct.groupID.Liberal, "AAAA", 1.0, makeTS(2010, 1, 1)), - BiologyDcirAct("Patient_B", BiologyDcirAct.groupID.Liberal, "CCCC", 1.0, makeTS(2010, 3, 1)), - BiologyDcirAct("Patient_B", BiologyDcirAct.groupID.PrivateAmbulatory, "CCCC", 1.0, makeTS(2010, 4, 1)) - ).toDS - - // When - val result = DcirBiologyActExtractor.extract(sources, codes) - - // Then - assertDSs(result, expected) - } - - "extract" should "return a Dataset of DCIR Biology Acts from raw data" in { - - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val codes = Set("238") - - val input = sqlCtx.read.parquet("src/test/resources/test-input/DCIR.parquet") - - val sources = Sources(dcir = Some(input)) - - val expected = Seq[Event[MedicalAct]]( - BiologyDcirAct("Patient_01", BiologyDcirAct.groupID.Liberal, "238", 1.0, makeTS(2006, 1, 15)) - ).toDS - - // When - val result = DcirBiologyActExtractor.extract(sources, codes) - - // Then - assertDSs(result, expected) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirMedicalActsSuite.scala 
b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirMedicalActsSuite.scala deleted file mode 100644 index d2e1c8bf..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/DcirMedicalActsSuite.scala +++ /dev/null @@ -1,248 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.acts - -import scala.util.Success -import org.scalatest.matchers.should.Matchers.{an, convertToAnyShouldWrapper} -import org.scalatest.TryValues._ -import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema -import org.apache.spark.sql.types._ -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.{DcirAct, Event, MedicalAct} -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions._ - -class DcirMedicalActsSuite extends SharedContext { - - import DcirMedicalActExtractor.ColNames - - val schema = StructType( - StructField(ColNames.PatientID, StringType) :: - StructField(ColNames.CamCode, StringType) :: - StructField(ColNames.InstitutionCode, DoubleType) :: - StructField(ColNames.GHSCode, DoubleType) :: - StructField(ColNames.Sector, DoubleType) :: - StructField(ColNames.Date, DateType) :: Nil - ) - - val oldSchema = StructType( - StructField(ColNames.PatientID, StringType) :: - StructField(ColNames.CamCode, StringType) :: - StructField(ColNames.Date, DateType) :: Nil - ) - - "isInStudy" should "return true when a study code is found in the row" in { - - // Given - val codes = Set("AAAA", "BBBB") - val inputArray = Array[Any]("Patient_A", "AAAA", null, null, null, makeTS(2010, 1, 1)) - val inputRow = new GenericRowWithSchema(inputArray, schema) - - // When - val result = DcirMedicalActExtractor.isInStudy(codes)(inputRow) - - // Then - assert(result) - } - - it should "return false when no code is found in the row" in { - - // Given - val codes = Set("AAAA", "BBBB") - val inputArray = Array[Any]("Patient_A", "CCCC", 1D, 0D, 1D, 
makeTS(2010, 1, 1)) - val inputRow = new GenericRowWithSchema(inputArray, schema) - - // When - val result = DcirMedicalActExtractor.isInStudy(codes)(inputRow) - - // Then - assert(!result) - } - - "builder" should "return a DCIR act if the event is in a older version of DCIR" in { - // Given - val inputArray = Array[Any]("Patient_A", "AAAA", makeTS(2010, 1, 1)) - val inputRow = new GenericRowWithSchema(inputArray, oldSchema) - val expected = Seq(DcirAct("Patient_A", DcirAct.groupID.DcirAct, "AAAA", 1.0, makeTS(2010, 1, 1))) - - // When - val result = DcirMedicalActExtractor.builder(inputRow) - - // Then - assert(result == expected) - } - - "getGHS" should "return the value in the correct column" in { - // Given - val schema = StructType(StructField(ColNames.GHSCode, DoubleType) :: Nil) - val inputArray = Array[Any](3D) - val input = new GenericRowWithSchema(inputArray, schema) - val expected = 3D - - // When - val result = DcirMedicalActExtractor.getGHS(input) - - // Then - assert(result == expected) - } - - "getSector" should "return the expected value" in { - // Given - val schema = StructType(StructField(ColNames.Sector, DoubleType) :: Nil) - val inputArray = Array[Any](3D) - val input = new GenericRowWithSchema(inputArray, schema) - val expected = 3D - - // When - val result = DcirMedicalActExtractor.getSector(input) - - // Then - assert(result == expected) - } - - "getInstitutionCode" should "return the value in the correct column" in { - // Given - val schema = StructType(StructField(ColNames.InstitutionCode, DoubleType) :: Nil) - val inputArray = Array[Any](52D) - val input = new GenericRowWithSchema(inputArray, schema) - val expected = 52D - - // When - val result = DcirMedicalActExtractor.getInstitutionCode(input) - - // Then - assert(result == expected) - - } - - "getGroupID" should "return correct status of private ambulatory" in { - // Given - val schema = StructType( - StructField(ColNames.GHSCode, DoubleType) :: - StructField(ColNames.Sector, 
StringType) :: - StructField(ColNames.InstitutionCode, DoubleType) :: Nil - ) - val array = Array[Any](0D, 2D, 6D) - val input = new GenericRowWithSchema(array, schema) - val expected = Success(DcirAct.groupID.PrivateAmbulatory) - - // When - val result = DcirMedicalActExtractor.getGroupId(input) - - // Then - assert(result == expected) - - } - - it should "return Success(PublicAmbulatory) if it is public related" in { - // Given - val schema = StructType(StructField(ColNames.Sector, StringType) :: Nil) - val array = Array[Any](1D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirMedicalActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.PublicAmbulatory - } - - it should "return Success(Liberal) if it is liberal act" in { - // Given - val schema = StructType( - StructField(ColNames.Sector, StringType) :: StructField( - ColNames.GHSCode, - StringType - ) :: Nil - ) - val array = Array[Any](null, null) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirMedicalActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.Liberal - } - - it should "return Success(PrivateAmbulatory) if it is private ambulatory act" in { - // Given - val schema = StructType( - StructField(ColNames.Sector, StringType) :: StructField( - ColNames.GHSCode, - DoubleType - ) :: StructField(ColNames.InstitutionCode, DoubleType) :: Nil - ) - val array = Array[Any](null, 0D, 4D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirMedicalActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.PrivateAmbulatory - } - - it should "return Success(UnkownSource) if it is an act with unknown source" in { - // Given - val schema = StructType( - StructField(ColNames.Sector, StringType) :: StructField( - ColNames.GHSCode, - DoubleType - ) :: StructField(ColNames.InstitutionCode, DoubleType) :: Nil - ) - val 
array = Array[Any](null, 1D, 4D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirMedicalActExtractor.getGroupId(input) - - // Then - result.success.value shouldBe DcirAct.groupID.Unknown - } - - it should "return IllegalArgumentException if the information of source of act is unavailable in DCIR" in { - // Given - val schema = StructType( - StructField(ColNames.GHSCode, DoubleType) :: - StructField(ColNames.Sector, StringType) :: Nil - ) - val array = Array[Any](0D, 2D, 6D) - val input = new GenericRowWithSchema(array, schema) - // When - val result = DcirMedicalActExtractor.getGroupId(input) - - // Then - result.failure.exception shouldBe an[IllegalArgumentException] - } - - "extract" should "return a Dataset of DCIR Medical Acts" in { - - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val codes = Set("AAAA", "CCCC") - - val input = Seq( - ("Patient_A", "AAAA", "NABM1", makeTS(2010, 1, 1), None, None, None), - ("Patient_A", "BBBB", "NABM1", makeTS(2010, 2, 1), Some(1D), Some(0D), Some(1D)), - ("Patient_B", "CCCC", "NABM1", makeTS(2010, 3, 1), None, None, None), - ("Patient_B", "CCCC", "NABM1", makeTS(2010, 4, 1), Some(7D), Some(0D), Some(2D)), - ("Patient_C", "BBBB", "NABM1", makeTS(2010, 5, 1), Some(1D), Some(0D), Some(2D)) - ).toDF( - ColNames.PatientID, ColNames.CamCode, ColNames.BioCode, ColNames.Date, - ColNames.InstitutionCode, ColNames.GHSCode, ColNames.Sector - ) - - val sources = Sources(dcir = Some(input)) - - val expected = Seq[Event[MedicalAct]]( - DcirAct("Patient_A", DcirAct.groupID.Liberal, "AAAA", 1.0, makeTS(2010, 1, 1)), - DcirAct("Patient_B", DcirAct.groupID.Liberal, "CCCC", 1.0, makeTS(2010, 3, 1)), - DcirAct("Patient_B", DcirAct.groupID.PrivateAmbulatory, "CCCC", 1.0, makeTS(2010, 4, 1)) - ).toDS - - // When - val result = DcirMedicalActExtractor.extract(sources, codes) - - // Then - assertDSs(result, expected) - } -} diff --git 
a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoCEMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoCEMedicalActsSuite.scala deleted file mode 100644 index de1e8797..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoCEMedicalActsSuite.scala +++ /dev/null @@ -1,84 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.acts - -import java.sql.Date -import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema -import org.apache.spark.sql.types.{StringType, StructField, StructType} -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.McoCEAct -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions.makeTS - -class McoCEMedicalActsSuite extends SharedContext { - - "isInStudy" should "return true if row is in study" in { - import McoCeActExtractor.ColNames - // Given - val codes = Set("coloscopie") - val schema = StructType( - StructField(ColNames.PatientID, StringType) :: - StructField(ColNames.CamCode, StringType) :: - StructField(ColNames.Date, StringType) :: Nil - ) - val data = Array[Any]("George", "coloscopie", "23012010") - val input = new GenericRowWithSchema(data, schema) - - // When - val result = McoCeActExtractor.isInStudy(codes)(input) - - // Then - assert(result) - } - - "extract" should "return acts that starts with the given codes" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val date = new Date(makeTS(2003, 2, 1).getTime) - - val input = List( - ("george", "coloscopie", date), - ("georgette", "angine", date) - ).toDF("NUM_ENQ", "MCO_FMSTC__CCAM_COD", "EXE_SOI_DTD") - - val sources = Sources(mcoCe = Some(input)) - - val expected = List( - McoCEAct("georgette", "ACE", "angine", makeTS(2003, 2, 1)) - ).toDS - - // When - val result = McoCeActExtractor.extract(sources, Set("angi")) - - // Then - 
assertDSs(expected, result) - } - - "extract" should "return all acts when codes are empty" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val date = new Date(makeTS(2003, 2, 1).getTime) - - val input = List( - ("george", "coloscopie", date), - ("georgette", "angine", date) - ).toDF("NUM_ENQ", "MCO_FMSTC__CCAM_COD", "EXE_SOI_DTD") - - val sources = Sources(mcoCe = Some(input)) - - val expected = List( - McoCEAct("georgette", "ACE", "angine", makeTS(2003, 2, 1)), - McoCEAct("george", "ACE", "coloscopie", makeTS(2003, 2, 1)) - ).toDS - - // When - val result = McoCeActExtractor.extract(sources, Set.empty) - - // Then - assertDSs(expected, result) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoMedicalActsSuite.scala deleted file mode 100644 index 20aab4ac..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/McoMedicalActsSuite.scala +++ /dev/null @@ -1,100 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.acts - -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events._ -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions._ - -class McoMedicalActsSuite extends SharedContext { - - "extract" should "return a DataSet of McoCIM10Act" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val cim10Codes = Set("C670") - val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") - val expected = Seq[Event[MedicalAct]]( - McoCIM10Act("Patient_02", "10000123_10000987_2006", "C670", makeTS(2005, 12, 29)), - McoCIM10Act("Patient_02", "10000123_20000123_2007", "C670", makeTS(2007, 1, 29)), - McoCIM10Act("Patient_02", "10000123_30000546_2008", "C670", makeTS(2008, 3, 8)) - ).toDS - - val input = Sources(mco = Some(mco)) - // When - val result = 
McoCimMedicalActExtractor.extract(input, cim10Codes) - - // Then - assertDSs(result, expected) - } - - it should "return all available McoCIM10Act when codes is Empty" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") - val expected = Seq[Event[MedicalAct]]( - McoCIM10Act("Patient_02", "10000123_10000543_2006", "C671", makeTS(2005, 12, 24)), - McoCIM10Act("Patient_02", "10000123_10000987_2006", "C670", makeTS(2005, 12, 29)), - McoCIM10Act("Patient_02", "10000123_20000123_2007", "C670", makeTS(2007, 1, 29)), - McoCIM10Act("Patient_02", "10000123_20000345_2007", "C671", makeTS(2007, 1, 29)), - McoCIM10Act("Patient_02", "10000123_30000546_2008", "C670", makeTS(2008, 3, 8)), - McoCIM10Act("Patient_02", "10000123_30000852_2008", "C671", makeTS(2008, 3, 15)) - ).toDS - - val input = Sources(mco = Some(mco)) - // When - val result = McoCimMedicalActExtractor.extract(input, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return a DataSet of McoCCAMActs" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val ccamCodes = Set("AAAA123") - val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") - val expected = Seq[Event[MedicalAct]]( - McoCCAMAct("Patient_02", "10000123_10000987_2006", "AAAA123", makeTS(2005, 12, 29)), - McoCCAMAct("Patient_02", "10000123_20000123_2007", "AAAA123", makeTS(2007, 1, 29)), - McoCCAMAct("Patient_02", "10000123_30000546_2008", "AAAA123", makeTS(2008, 3, 8)) - ).toDS - - val input = Sources(mco = Some(mco)) - // When - val result = McoCcamActExtractor.extract(input, ccamCodes) - - // Then - assertDSs(result, expected) - } - - it should "return all available McoCCAMActs when Codes is Empty" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") - val expected = Seq[Event[MedicalAct]]( - 
McoCCAMAct("Patient_02", "10000123_10000987_2006", "AAAA123", makeTS(2005, 12, 29)), - McoCCAMAct("Patient_02", "10000123_20000123_2007", "AAAA123", makeTS(2007, 1, 29)), - McoCCAMAct("Patient_02", "10000123_30000546_2008", "AAAA123", makeTS(2008, 3, 8)), - McoCCAMAct("Patient_02", "10000123_20000345_2007", "BBBB123", makeTS(2007, 1, 29)), - McoCCAMAct("Patient_02", "10000123_10000543_2006", "BBBB123", makeTS(2005, 12, 24)), - McoCCAMAct("Patient_02", "10000123_30000852_2008", "BBBB123", makeTS(2008, 3, 15)) - ).toDS - - val input = Sources(mco = Some(mco)) - // When - val result = McoCcamActExtractor.extract(input, Set.empty) - - // Then - assertDSs(result, expected) - } - -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/ImbDiagnosesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/ImbDiagnosesSuite.scala deleted file mode 100644 index 2c24adba..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/ImbDiagnosesSuite.scala +++ /dev/null @@ -1,47 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses - -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.ImbDiagnosis -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions.makeTS - -class ImbDiagnosesSuite extends SharedContext { - - "extract" should "extract diagnosis events from raw data" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val imb = sqlContext.read.load("src/test/resources/test-input/IR_IMB_R.parquet") - val expected = Seq(ImbDiagnosis("Patient_02", "C67", makeTS(2006, 3, 13))).toDS - - val sources = Sources(irImb = Some(imb)) - // When - val output = ImbDiagnosisExtractor.extract(sources, Set("C67")) - - // Then - assertDSs(expected, output) - } - - it should "extract all diagnosis events from raw data when an Empty codes is passed" in { - val sqlCtx = 
sqlContext - import sqlCtx.implicits._ - - // Given - val imb = sqlContext.read.load("src/test/resources/test-input/IR_IMB_R.parquet") - val expected = Seq( - ImbDiagnosis("Patient_02", "C67", makeTS(2006, 3, 13)), - ImbDiagnosis("Patient_02", "E11", makeTS(2006, 1, 25)), - ImbDiagnosis("Patient_02", "9999", makeTS(2006, 4, 25)) - ).toDS - - val sources = Sources(irImb = Some(imb)) - // When - val output = ImbDiagnosisExtractor.extract(sources, Set.empty) - - // Then - assertDSs(expected, output) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugsExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugsExtractorSuite.scala deleted file mode 100644 index 4a035fc3..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/DrugsExtractorSuite.scala +++ /dev/null @@ -1,376 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.drugs - -import org.apache.spark.sql.Dataset -import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema -import org.apache.spark.sql.functions.lit -import org.apache.spark.sql.types.{StringType, StructField, StructType} -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event} -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification._ -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.{Antidepresseurs, Antihypertenseurs, Hypnotiques, Neuroleptiques} -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.{Cip13Level, MoleculeCombinationLevel, PharmacologicalLevel, TherapeuticLevel} -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions.makeTS - -class DrugsExtractorSuite extends SharedContext { - - "extract" should "return all drugs when empty family list is passed" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - 
("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1))), - ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1))), - ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1))), - ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient1", "9111111111111", 1, makeTS(2014, 5, 1)), - Drug("patient2", "3400935183644", 1, makeTS(2014, 6, 1)), - Drug("patient3", "3400935418487", 1, makeTS(2014, 7, 1)), - Drug("patient4", "3400935183644", 1, makeTS(2014, 8, 1)), - Drug("patient8", "3400936889651", 1, makeTS(2014, 9, 1)) - ).toDS - - val source = new Sources( - irPha = Some( - Seq( - (Some("9111111111111"), "toto", "GC"), - (Some("3400935183644"), "toto", "GC"), - (Some("3400935418487"), "toto", "GC"), - (Some("3400936889651"), "toto", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConf = DrugConfig(Cip13Level, List.empty) - - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "work correctly based on the DrugConfig Antidepresseurs" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1))), - ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1))), - ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1))), - ("patient5", Some("3400936889651"), None), - ("patient6", None, Some(makeTS(2014, 9, 1))), - ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient2", "Antidepresseurs", 
1, makeTS(2014, 6, 1)), - Drug("patient4", "Antidepresseurs", 1, makeTS(2014, 8, 1)), - Drug("patient8", "Antidepresseurs", 1, makeTS(2014, 9, 1)) - ).toDS - - val source = new Sources( - irPha = Some( - Seq( - (Some("9111111111111"), "toto", "GC"), - (Some("3400935183644"), "toto", "GC"), - (Some("3400935418487"), "toto", "GC"), - (Some("3400936889651"), "toto", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConf = DrugConfig(TherapeuticLevel, List(Antidepresseurs)) - - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return expected drug purchases with Therapeutic level of classification with class Neuroleptiques" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1))), - ("patient3", Some("3400935183644"), Some(makeTS(2014, 7, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient2", "Neuroleptiques", 2, makeTS(2014, 6, 1)) - ).toDS - - val source = new Sources( - irPha = Some( - Seq( - (Some("9111111111111"), "toto", "NGC"), - (Some("3400935183644"), "toto", "NGC"), - (Some("3400930023648"), "toto", "NGC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConf = DrugConfig(TherapeuticLevel, List(Neuroleptiques)) - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return expected drug purchases with Therapeutic level of classification with class Hypnotiques" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = 
Seq( - ("patient1", Some("3400930081143"), Some(makeTS(2014, 6, 1))), - ("patient2", Some("3400936099777"), Some(makeTS(2014, 7, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient1", "Hypnotiques", 2, makeTS(2014, 6, 1)), - Drug("patient2", "Hypnotiques", 1, makeTS(2014, 7, 1)) - ).toDS - - val source = new Sources( - irPha = Some( - Seq( - (Some("3400930081143"), "toto", "NGC"), - (Some("3400936099777"), "toto", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConf = DrugConfig(TherapeuticLevel, List(Hypnotiques)) - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return expected drug purchases with Therapeutic level of classification with class Antihypertenseurs" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - ("patient1", Some("3400937354004"), Some(makeTS(2014, 6, 1))), - ("patient2", Some("3400936099777"), Some(makeTS(2014, 7, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient1", "Antihypertenseurs", 1, makeTS(2014, 6, 1)) - ).toDS - - - val source = new Sources( - irPha = Some( - Seq( - (Some("3400937354004"), "toto", "GC"), - (Some("3400936099777"), "toto", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConf = DrugConfig(TherapeuticLevel, List(Antihypertenseurs)) - - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return expected drug purchases with Therapeutic level" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - 
("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1))), - ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1))), - ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1))), - ("patient5", Some("3400936889651"), None), - ("patient6", None, Some(makeTS(2014, 9, 1))), - ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1))), - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient2", "Antidepresseurs", 1, makeTS(2014, 6, 1)), - Drug("patient4", "Antidepresseurs", 1, makeTS(2014, 8, 1)), - Drug("patient8", "Antidepresseurs", 1, makeTS(2014, 9, 1)), - Drug("patient2", "Neuroleptiques", 1, makeTS(2014, 6, 1)) - ).toDS - - val source = new Sources( - irPha = Some( - Seq( - (Some("9111111111111"), "toto", "GC"), - (Some("3400935183644"), "N06AA04", "GC"), - (Some("3400935418487"), "A10BB09", "GC"), - (Some("3400936889651"), "N06AB03", "GC"), - (Some("3400930023648"), "N05AX12", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConfigAntidepresseurs: DrugClassConfig = Antidepresseurs - val drugConfigNeuroleptiques: DrugClassConfig = Neuroleptiques - val drugConf = DrugConfig(TherapeuticLevel, List(drugConfigAntidepresseurs, drugConfigNeuroleptiques)) - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return expected drug purchases with Pharmacological level" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1))), - ("patient3", 
Some("3400935418487"), Some(makeTS(2014, 7, 1))), - ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1))), - ("patient5", Some("3400936889651"), None), - ("patient6", None, Some(makeTS(2014, 9, 1))), - ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1))), - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient2", "Antidepresseurs_Tricycliques", 1, makeTS(2014, 6, 1)), - Drug("patient4", "Antidepresseurs_Tricycliques", 1, makeTS(2014, 8, 1)), - Drug("patient8", "Antidepresseurs_ISRS", 1, makeTS(2014, 9, 1)), - Drug("patient2", "Neuroleptiques_Autres_neuroleptiques", 1, makeTS(2014, 6, 1)) - ).toDS - - val source = new Sources( - irPha = Some( - Seq( - (Some("9111111111111"), "toto", "GC"), - (Some("3400935183644"), "N06AA04", "GC"), - (Some("3400935418487"), "A10BB09", "GC"), - (Some("3400936889651"), "N06AB03", "GC"), - (Some("3400930023648"), "N05AX12", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP") - .withColumn("molecule_combination", lit("")) - ), dcir = Some(inputDF) - ) - - val drugConfigAntidepresseurs: DrugClassConfig = Antidepresseurs - val drugConfigNeuroleptiques: DrugClassConfig = Neuroleptiques - val drugConf = DrugConfig(PharmacologicalLevel, List(drugConfigAntidepresseurs, drugConfigNeuroleptiques)) - // When - val result = new DrugExtractor(drugConf).extract(source, Set.empty) - - // Then - assertDSs(result, expected) - } - - "extract" should "return expected drug purchases with MoleculeCombination level" in { - - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - val inputDF = Seq( - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1))), - ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1))), - ("patient4", Some("3400935183644"), 
Some(makeTS(2014, 8, 1))), - ("patient5", Some("3400936889651"), None), - ("patient6", None, Some(makeTS(2014, 9, 1))), - ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1))), - ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1))), - ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1))) - ).toDF("NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD") - - val expected: Dataset[Event[Drug]] = Seq( - Drug("patient2", "N06AA04", 1, makeTS(2014, 6, 1)), - Drug("patient4", "N06AA04", 1, makeTS(2014, 8, 1)), - Drug("patient8", "DEXTROPROPOXYPHENE_PARACETAMOL_CAFEINE", 1, makeTS(2014, 9, 1)), - Drug("patient2", "INSULINE LISPRO (PROTAMINE)", 1, makeTS(2014, 6, 1)) - ).toDS - - val irPha = Seq( - (Some("9111111111111"), "toto", "toto", "GC"), - (Some("3400935183644"), "N06AA04", "N06AA04", "GC"), - (Some("3400935418487"), "A10BB09", "A10BB09", "GC"), - (Some("3400936889651"), "N06AB03", "DEXTROPROPOXYPHENE_PARACETAMOL_CAFEINE", "GC"), - (Some("3400930023648"), "N05AX12", "INSULINE LISPRO (PROTAMINE)", "GC") - ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "molecule_combination", "PHA_CND_TOP") - - val source = new Sources(irPha = Some(irPha), dcir = Some(inputDF)) - - val drugConfigAntidepresseurs: DrugClassConfig = Antidepresseurs - val drugConfigNeuroleptiques: DrugClassConfig = Neuroleptiques - val drugConf = DrugConfig(MoleculeCombinationLevel, List(drugConfigAntidepresseurs, drugConfigNeuroleptiques)) - // When - val result = new DrugExtractor(drugConf).extract(source, Set("SHIT")) - - // Then - assertDSs(result, expected) - } - - "extractGroupId" should "return the group ID for done values" in { - // Given - val schema = StructType( - Seq( - StructField("FLX_DIS_DTD", StringType), - StructField("FLX_TRT_DTD", StringType), - StructField("FLX_EMT_TYP", StringType), - StructField("FLX_EMT_NUM", StringType), - StructField("FLX_EMT_ORD", StringType), - StructField("ORG_CLE_NUM", StringType), - StructField("DCT_ORD_NUM", StringType) - ) - ) - - val values = 
Array[Any]("2014-08-01", "2014-08-17", "1", "17", "0", "01C673000", "1759") - val r = new GenericRowWithSchema(values, schema) - val expected = "MjAxNC0wOC0wMV8yMDE0LTA4LTE3XzFfMTdfMF8wMUM2NzMwMDBfMTc1OQ==" - - val drugConf = DrugConfig(Cip13Level, List.empty) - - // When - val result = new DrugExtractor(drugConf).extractGroupId(r) - - // Then - assert(result == expected) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirBiologyActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirBiologyActsSuite.scala new file mode 100644 index 00000000..b913a522 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirBiologyActsSuite.scala @@ -0,0 +1,198 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types._ +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{BiologyDcirAct, DcirAct, Event, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir.DcirSource +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.makeTS + + +class DcirBiologyActsSuite extends SharedContext { + + val colNames = new DcirSource {}.ColNames + + val schema = StructType( + StructField(colNames.PatientID, StringType) :: + StructField(colNames.BioCode, StringType) :: + StructField(colNames.InstitutionCode, DoubleType) :: + StructField(colNames.GHSCode, DoubleType) :: + StructField(colNames.Sector, DoubleType) :: + StructField(colNames.FlowDistributionDate, DateType) :: Nil + ) + + val oldSchema = StructType( + StructField(colNames.PatientID, StringType) :: + StructField(colNames.BioCode, StringType) :: + StructField(colNames.FlowDistributionDate, DateType) :: Nil + ) + + it should 
"return false when no code is found in the row" in { + + // Given + val codes = SimpleExtractorCodes(List("AAAA", "BBBB")) + val inputArray = Array[Any]("Patient_A", "CCCC", 1D, 0D, 1D, makeTS(2010, 1, 1)) + val inputRow = new GenericRowWithSchema(inputArray, schema) + + // When + val result = DcirBiologyActExtractor(codes).isInStudy(inputRow) + + // Then + assert(!result) + } + + "getGroupID" should "return correct status of private ambulatory" in { + // Given + val schema = StructType( + StructField(colNames.GHSCode, DoubleType) :: + StructField(colNames.Sector, StringType) :: + StructField(colNames.InstitutionCode, DoubleType) :: Nil + ) + val array = Array[Any](0D, 2D, 6D) + val input = new GenericRowWithSchema(array, schema) + + // When + val result = DcirBiologyActExtractor(SimpleExtractorCodes(List("AAAA", "BBBB"))).extractGroupId(input) + + // Then + assert(result == DcirAct.groupID.PrivateAmbulatory) + + } + + it should "return Success(PublicAmbulatory) if it is public related" in { + // Given + val schema = StructType(StructField(colNames.Sector, StringType) :: Nil) + val array = Array[Any](1D) + val input = new GenericRowWithSchema(array, schema) + // When + val result = DcirBiologyActExtractor(SimpleExtractorCodes.empty).extractGroupId(input) + + // Then + assert(result == DcirAct.groupID.PublicAmbulatory) + } + + it should "return Success(Liberal) if it is liberal act" in { + // Given + val schema = StructType( + StructField(colNames.Sector, StringType) :: StructField( + colNames.GHSCode, + StringType + ) :: Nil + ) + val array = Array[Any](null, null) + val input = new GenericRowWithSchema(array, schema) + // When + val result = DcirBiologyActExtractor(SimpleExtractorCodes.empty).extractGroupId(input) + + // Then + assert(result == DcirAct.groupID.Liberal) + } + + it should "return Success(PrivateAmbulatory) if it is private ambulatory act" in { + // Given + val schema = StructType( + StructField(colNames.Sector, StringType) :: StructField( + 
colNames.GHSCode, + DoubleType + ) :: StructField(colNames.InstitutionCode, DoubleType) :: Nil + ) + val array = Array[Any](null, 0D, 4D) + val input = new GenericRowWithSchema(array, schema) + // When + val result = DcirBiologyActExtractor(SimpleExtractorCodes.empty).extractGroupId(input) + + // Then + assert(result == DcirAct.groupID.PrivateAmbulatory) + } + + it should "return Success(UnknownSource) if it is an act with unknown source" in { + // Given + val schema = StructType( + StructField(colNames.Sector, StringType) :: + StructField(colNames.GHSCode, DoubleType) :: + StructField(colNames.InstitutionCode, DoubleType) :: + Nil + ) + val array = Array[Any](null, 1D, 4D) + val input = new GenericRowWithSchema(array, schema) + // When + val result = DcirBiologyActExtractor(SimpleExtractorCodes.empty).extractGroupId(input) + + // Then + assert(result == DcirAct.groupID.Unknown) + } + + it should "return the default value if the source information of the act is unavailable in DCIR" in { + // Given + val schema = StructType( + StructField(colNames.GHSCode, DoubleType) :: + StructField(colNames.Sector, StringType) :: Nil + ) + val array = Array[Any](0D, 2D, 6D) + val input = new GenericRowWithSchema(array, schema) + // When + val result = DcirBiologyActExtractor(SimpleExtractorCodes.empty).extractGroupId(input) + + // Then + assert(result == DcirAct.groupID.DcirAct) + } + + /* "extract" should "return a Dataset of DCIR Biology Acts" in { + + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val codes = BaseExtractorCodes(List("AAAA", "CCCC")) + + val input = Seq( + ("Patient_A", "AAAA", "CCAM1", Some(makeTS(2010, 1, 1)), None, None, None, makeTS(2010, 1, 1)), + ("Patient_A", "BBBB", "CCAM1", Some(makeTS(2010, 2, 1)), Some(1D), Some(0D), Some(1D), makeTS(2010, 2, 1)), + ("Patient_B", "CCCC", "CCAM1", Some(makeTS(2010, 3, 1)), None, None, None, makeTS(2010, 3, 1)), + ("Patient_B", "CCCC", "CCAM1", Some(makeTS(2010, 4, 1)), Some(7D), Some(0D), Some(2D), 
makeTS(2010, 4, 1)), + ("Patient_C", "BBBB", "CCAM1", None, Some(1D), Some(0D), Some(2D), makeTS(2010, 5, 1)) + ).toDF( + colNames.PatientID, colNames.BioCode, colNames.CamCode, colNames.FlowDistributionDate, + colNames.InstitutionCode, colNames.GHSCode, colNames.Sector, colNames.DcirFluxDate + ) + + val sources = Sources(dcir = Some(input)) + + val expected = Seq[Event[MedicalAct]]( + BiologyDcirAct("Patient_A", BiologyDcirAct.groupID.Liberal, "AAAA", 1.0, makeTS(2010, 1, 1)), + BiologyDcirAct("Patient_B", BiologyDcirAct.groupID.Liberal, "CCCC", 1.0, makeTS(2010, 3, 1)), + BiologyDcirAct("Patient_B", BiologyDcirAct.groupID.PrivateAmbulatory, "CCCC", 1.0, makeTS(2010, 4, 1)) + ).toDS + + // When + val result = DcirBiologyActExtractor(codes).extract(sources) + + // Then + assertDSs(result, expected) + }*/ + + "extract" should "return a Dataset of DCIR Biology Acts from raw data" in { + + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val codes = SimpleExtractorCodes(List("238")) + + val input = sqlCtx.read.parquet("src/test/resources/test-input/DCIR_w_BIO.parquet") + + val sources = Sources(dcir = Some(input)) + + val expected = Seq[Event[MedicalAct]]( + BiologyDcirAct("Patient_01", BiologyDcirAct.groupID.Liberal, "238", 0.0, makeTS(2006, 1, 15)) + ).toDS + + // When + val result = DcirBiologyActExtractor(codes).extract(sources) + + // Then + assertDSs(result, expected) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirMedicalActsSuite.scala new file mode 100644 index 00000000..1c099c4e --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/DcirMedicalActsSuite.scala @@ -0,0 +1,63 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import org.apache.spark.sql.types._ +import fr.polytechnique.cmap.cnam.SharedContext +import 
fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir.DcirSource + +class DcirMedicalActsSuite extends SharedContext { + + val colNames = new DcirSource {}.ColNames + + val schema = StructType( + StructField(colNames.PatientID, StringType) :: + StructField(colNames.CamCode, StringType) :: + StructField(colNames.InstitutionCode, DoubleType) :: + StructField(colNames.GHSCode, DoubleType) :: + StructField(colNames.Sector, DoubleType) :: + StructField(colNames.FlowDistributionDate, DateType) :: Nil + ) + + val oldSchema = StructType( + StructField(colNames.PatientID, StringType) :: + StructField(colNames.CamCode, StringType) :: + StructField(colNames.FlowDistributionDate, DateType) :: Nil + ) + + /* "extract" should "return a Dataset of DCIR Medical Acts" in { + + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val codes = BaseExtractorCodes(List("AAAA", "CCCC", "DDDD")) + + val input = Seq( + ("Patient_A", "AAAA", "NABM1", makeTS(2010, 1, 1), None, None, None), + ("Patient_A", "BBBB", "NABM1", makeTS(2010, 2, 1), Some(1D), Some(0D), Some(1D)), + ("Patient_B", "CCCC", "NABM1", makeTS(2010, 3, 1), None, None, None), + ("Patient_B", "CCCC", "NABM1", makeTS(2010, 4, 1), Some(7D), Some(0D), Some(2D)), + ("Patient_C", "BBBB", "NABM1", makeTS(2010, 5, 1), Some(1D), Some(0D), Some(2D)), + ("Patient_D", "DDDD", "NABM1", null, None, None, None) + ).toDF( + colNames.PatientID, colNames.CamCode, colNames.BioCode, colNames.FlowDistributionDate, + colNames.InstitutionCode, colNames.GHSCode, colNames.Sector + ) + + val sources = Sources(dcir = Some(input)) + + val expected = Seq[Event[MedicalAct]]( + DcirAct("Patient_A", DcirAct.groupID.Liberal, "AAAA", 1.0, makeTS(2010, 1, 1)), + DcirAct("Patient_B", DcirAct.groupID.Liberal, "CCCC", 1.0, makeTS(2010, 3, 1)), + DcirAct("Patient_B", DcirAct.groupID.PrivateAmbulatory, "CCCC", 1.0, makeTS(2010, 4, 1)), + DcirAct("Patient_D", DcirAct.groupID.Liberal, "DDDD", 1.0, makeTS(1970, 1, 1)) + ).toDS + + // When + val result 
= DcirMedicalActExtractor(codes).extract(sources) + + // Then + assertDSs(result, expected) + }*/ +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/HadMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/HadMedicalActsSuite.scala similarity index 74% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/HadMedicalActsSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/HadMedicalActsSuite.scala index 539089e6..aeed4987 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/HadMedicalActsSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/HadMedicalActsSuite.scala @@ -1,18 +1,19 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.acts +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.events.{Event, HadCCAMAct, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions._ +import fr.polytechnique.cmap.cnam.util.functions.makeTS class HadMedicalActsSuite extends SharedContext { - + "extract" should "return a DataSet of HadCCAMActs" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val ccamCodes = Set("HPQD001") + val ccamCodes = SimpleExtractorCodes(List("HPQD001")) val had = spark.read.parquet("src/test/resources/test-input/HAD.parquet") val expected = Seq[Event[MedicalAct]]( HadCCAMAct("patient02", "10000201_30000150_2019", "HPQD001", makeTS(2019, 12, 24)), @@ -21,7 +22,7 @@ class HadMedicalActsSuite extends SharedContext { val input = Sources(had = Some(had)) // When - val result = HadCcamActExtractor.extract(input, ccamCodes) + val result = HadCcamActExtractor(ccamCodes).extract(input) // 
Then assertDSs(result, expected) @@ -41,7 +42,7 @@ class HadMedicalActsSuite extends SharedContext { val input = Sources(had = Some(had)) // When - val result = HadCcamActExtractor.extract(input, Set.empty) + val result = HadCcamActExtractor(SimpleExtractorCodes.empty).extract(input) // Then assertDSs(result, expected) diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCEMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCEMedicalActsSuite.scala new file mode 100644 index 00000000..483b54f1 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoCEMedicalActsSuite.scala @@ -0,0 +1,71 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{Event, McoCeCcamAct, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce.McoCeSource +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class McoCEMedicalActsSuite extends SharedContext { + + "isInStudy" should "return true if row is in study" in { + val colNames = new McoCeSource {}.ColNames + // Given + val codes = SimpleExtractorCodes(List("coloscopie")) + val schema = StructType( + StructField(colNames.PatientID, StringType) :: + StructField(colNames.CamCode, StringType) :: + StructField(colNames.StartDate, StringType) :: Nil + ) + val data = Array[Any]("George", "coloscopie", "23012010") + val input = new GenericRowWithSchema(data, schema) + + // When + val result = McoCeCcamActExtractor(codes).isInStudy(input) + + // Then + assert(result) + } + + "extract" should "return 
acts that start with the given codes" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + // Given + val ccamCodes = SimpleExtractorCodes(List("DEM")) + val mcoCe = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet") + val expected = Seq[Event[MedicalAct]]( + McoCeCcamAct("200410", "190000059_00022621_2014", "DEMP002", makeTS(2014, 4, 18)) + ).toDS + + val input = Sources(mcoCe = Some(mcoCe)) + // When + val result = McoCeCcamActExtractor(ccamCodes).extract(input) + + // Then + assertDSs(expected, result) + } + + "extract" should "return all acts when codes are empty" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + // Given + val mcoCe = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet") + val expected = Seq[Event[MedicalAct]]( + McoCeCcamAct("200410", "190000059_00022621_2014", "DEMP002", makeTS(2014, 4, 18)), + McoCeCcamAct("2004100010", "390780146_00098382_2014", "DZQM006", makeTS(2014, 11, 6)), + McoCeCcamAct("2004100010", "390780146_00015211_2014", "DEQP005", makeTS(2014, 2, 11)) + ).toDS + + val input = Sources(mcoCe = Some(mcoCe)) + // When + val result = McoCeCcamActExtractor(SimpleExtractorCodes.empty).extract(input) + + // Then + assertDSs(expected, result) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoMedicalActsSuite.scala new file mode 100644 index 00000000..6cd06ea4 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/McoMedicalActsSuite.scala @@ -0,0 +1,57 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.acts + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events._ +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import 
fr.polytechnique.cmap.cnam.util.functions._ + +class McoMedicalActsSuite extends SharedContext { + + "extract" should "return a DataSet of McoCCAMActs" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val ccamCodes = SimpleExtractorCodes(List("AAAA123")) + val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val expected = Seq[Event[MedicalAct]]( + McoCCAMAct("Patient_02", "10000123_10000987_2006", "AAAA123", makeTS(2005, 12, 31)), + McoCCAMAct("Patient_02", "10000123_20000123_2007", "AAAA123", makeTS(2007, 1, 31)), + McoCCAMAct("Patient_02", "10000123_30000546_2008", "AAAA123", makeTS(2008, 3, 10)) + ).toDS + + val input = Sources(mco = Some(mco)) + // When + val result = McoCcamActExtractor(ccamCodes).extract(input) + + // Then + assertDSs(result, expected) + } + + it should "return all available McoCCAMActs when Codes is Empty" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val expected = Seq[Event[MedicalAct]]( + McoCCAMAct("Patient_02", "10000123_10000987_2006", "AAAA123", makeTS(2005, 12, 31)), + McoCCAMAct("Patient_02", "10000123_20000123_2007", "AAAA123", makeTS(2007, 1, 31)), + McoCCAMAct("Patient_02", "10000123_30000546_2008", "AAAA123", makeTS(2008, 3, 10)), + McoCCAMAct("Patient_02", "10000123_20000345_2007", "BBBB123", makeTS(2007, 1, 31)), + McoCCAMAct("Patient_02", "10000123_10000543_2006", "BBBB123", makeTS(2005, 12, 26)), + McoCCAMAct("Patient_02", "10000123_30000852_2008", "BBBB123", makeTS(2008, 3, 17)) + ).toDS + + val input = Sources(mco = Some(mco)) + // When + val result = McoCcamActExtractor(SimpleExtractorCodes.empty).extract(input) + + // Then + assertDSs(result, expected) + } + +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/MedicalActsConfigSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/MedicalActsConfigSuite.scala 
similarity index 83%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/MedicalActsConfigSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/MedicalActsConfigSuite.scala
index 7c5cf1a4..e0f49b2a 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/MedicalActsConfigSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/MedicalActsConfigSuite.scala
@@ -1,4 +1,4 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
+package fr.polytechnique.cmap.cnam.etl.extractors.events.acts
 import org.scalatest.matchers.should.Matchers.{a, convertToAnyShouldWrapper}
 import fr.polytechnique.cmap.cnam.SharedContext
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrCEMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrCEMedicalActsSuite.scala
new file mode 100644
index 00000000..70e3525c
--- /dev/null
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrCEMedicalActsSuite.scala
@@ -0,0 +1,85 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.events.acts
+
+import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
+import org.apache.spark.sql.types.{DateType, StringType, StructField, StructType}
+import fr.polytechnique.cmap.cnam.SharedContext
+import fr.polytechnique.cmap.cnam.etl.events.{Event, MedicalAct, SsrCEAct}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce.SsrCeSource
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+import fr.polytechnique.cmap.cnam.util.functions.makeTS
+
+class SsrCEMedicalActsSuite extends SharedContext {
+
+  val colNames = new SsrCeSource {}.ColNames
+
+  val schema = StructType(
+    StructField(colNames.PatientID, StringType) ::
+      StructField(colNames.CamCode, StringType) ::
+      StructField(colNames.StartDate, DateType) :: Nil
+  )
+
+  "isInStudy" should "return true when a study code is found in the row" in {
+
+    // Given
+    val codes = SimpleExtractorCodes(List("AAAA", "BBBB"))
+    val inputArray = Array[Any]("Patient_A", "AAAA", makeTS(2010, 1, 1))
+    val inputRow = new GenericRowWithSchema(inputArray, schema)
+
+    // When
+    val result = SsrCeActExtractor(codes).isInStudy(inputRow)
+
+    // Then
+    assert(result)
+  }
+
+  it should "return false when no code is found in the row" in {
+
+    // Given
+    val codes = SimpleExtractorCodes(List("AAAA", "BBBB"))
+    val inputArray = Array[Any]("Patient_A", "CCCC", makeTS(2010, 1, 1))
+    val inputRow = new GenericRowWithSchema(inputArray, schema)
+
+    // When
+    val result = SsrCeActExtractor(codes).isInStudy(inputRow)
+
+    // Then
+    assert(!result)
+  }
+
+  "extract" should "return a Dataset of Ssr CE Medical Acts" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val codes = SimpleExtractorCodes(List("AAAA", "CCCC"))
+
+    val input = Seq(
+      ("Patient_A", "AAAA", makeTS(2010, 1, 1)),
+      ("Patient_A", "BBBB", makeTS(2010, 2, 1)),
+      ("Patient_B", "CCCC", makeTS(2010, 3, 1)),
+      ("Patient_B", "CCCC", makeTS(2010, 4, 1)),
+      ("Patient_C", "BBBB", makeTS(2010, 5, 1))
+    ).toDF(
+      colNames.PatientID, colNames.CamCode, colNames.StartDate
+    )
+
+    val sources = Sources(ssrCe = Some(input))
+
+    val expected = Seq[Event[MedicalAct]](
+      SsrCEAct("Patient_A", "NA", "AAAA", 0.0, makeTS(2010, 1, 1)),
+      SsrCEAct("Patient_B", "NA", "CCCC", 0.0, makeTS(2010, 3, 1)),
+      SsrCEAct("Patient_B", "NA", "CCCC", 0.0, makeTS(2010, 4, 1))
+    ).toDS
+
+    // When
+    val result = SsrCeActExtractor(codes).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/SsrMedicalActsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrMedicalActsSuite.scala
similarity index 83%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/SsrMedicalActsSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrMedicalActsSuite.scala
index 8587be84..ba0dd19f 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/acts/SsrMedicalActsSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/acts/SsrMedicalActsSuite.scala
@@ -1,7 +1,8 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.acts
+package fr.polytechnique.cmap.cnam.etl.extractors.events.acts
 import fr.polytechnique.cmap.cnam.SharedContext
 import fr.polytechnique.cmap.cnam.etl.events._
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions._
@@ -12,7 +13,7 @@ class SsrMedicalActsSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val ccamCodes = Set("AHQP001")
+    val ccamCodes = SimpleExtractorCodes(List("AHQP001"))
     val ssr = spark.read.parquet("src/test/resources/test-joined/SSR.parquet")
     val expected = Seq[Event[MedicalAct]](
       SsrCCAMAct("Patient_02", "10000123_30000546_200_2019", "AHQP001", makeTS(2019, 8, 11)),
@@ -21,7 +22,7 @@ class SsrMedicalActsSuite extends SharedContext {
     val input = Sources(ssr = Some(ssr))
 
     // When
-    val result = SsrCcamActExtractor.extract(input, ccamCodes)
+    val result = SsrCcamActExtractor(ccamCodes).extract(input)
 
     // Then
     assertDSs(result, expected)
@@ -42,7 +43,7 @@ class SsrMedicalActsSuite extends SharedContext {
     val input = Sources(ssr = Some(ssr))
 
     // When
-    val result = SsrCcamActExtractor.extract(input, Set.empty)
+    val result = SsrCcamActExtractor(SimpleExtractorCodes.empty).extract(input)
 
     // Then
     assertDSs(result, expected)
@@ -53,7 +54,7 @@ class SsrMedicalActsSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val ccamCodes = Set("BLR+156")
+    val ccarrCodes = SimpleExtractorCodes(List("BLR+156"))
     val ssr = spark.read.parquet("src/test/resources/test-joined/SSR.parquet")
     val expected = Seq[Event[MedicalAct]](
       SsrCSARRAct("Patient_02", "10000123_30000546_200_2019", "BLR+156", makeTS(2019, 8, 11)),
@@ -62,7 +63,7 @@ class SsrMedicalActsSuite extends SharedContext {
     val input = Sources(ssr = Some(ssr))
 
     // When
-    val result = SsrCsarrActExtractor.extract(input, ccamCodes)
+    val result = SsrCsarrActExtractor(ccarrCodes).extract(input)
 
     // Then
     assertDSs(result, expected)
@@ -83,7 +84,7 @@ class SsrMedicalActsSuite extends SharedContext {
     val input = Sources(ssr = Some(ssr))
 
     // When
-    val result = SsrCsarrActExtractor.extract(input, Set.empty)
+    val result = SsrCsarrActExtractor(SimpleExtractorCodes.empty).extract(input)
 
     // Then
     assertDSs(result, expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/classifications/GHMClassificationsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/classifications/GHMClassificationsSuite.scala
similarity index 84%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/classifications/GHMClassificationsSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/classifications/GHMClassificationsSuite.scala
index c87b4990..007555d1 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/classifications/GHMClassificationsSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/classifications/GHMClassificationsSuite.scala
@@ -1,9 +1,10 @@
 // License: BSD 3 clause
-package fr.polytechnique.cmap.cnam.etl.extractors.classifications
+package fr.polytechnique.cmap.cnam.etl.extractors.events.classifications
 import fr.polytechnique.cmap.cnam.SharedContext
 import fr.polytechnique.cmap.cnam.etl.events.GHMClassification
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions._
@@ -15,7 +16,7 @@ class GHMClassificationsSuite extends SharedContext {
     // Given
     val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet")
-    val ghmCodes = Set("12H50L")
+    val ghmCodes = SimpleExtractorCodes(List("12H50L"))
     val expected = Seq(
       GHMClassification("Patient_02", "10000123_20000123_2007", "12H50L", makeTS(2007, 1, 29)),
@@ -25,7 +26,7 @@ class GHMClassificationsSuite extends SharedContext {
     val sources = Sources(mco = Some(mco))
 
     // When
-    val result = GhmExtractor.extract(sources, ghmCodes)
+    val result = GhmExtractor(ghmCodes).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -50,7 +51,7 @@ class GHMClassificationsSuite extends SharedContext {
     val sources = Sources(mco = Some(mco))
 
     // When
-    val result = GhmExtractor.extract(sources, Set.empty)
+    val result = GhmExtractor(SimpleExtractorCodes.empty).extract(sources)
 
     // Then
     assertDSs(result, expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/HadDiagnosesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/HadDiagnosesSuite.scala
similarity index 77%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/HadDiagnosesSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/HadDiagnosesSuite.scala
index 0cbb83fd..bf97c9b0 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/HadDiagnosesSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/HadDiagnosesSuite.scala
@@ -1,7 +1,8 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
+package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events._
+import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event, HadAssociatedDiagnosis, HadMainDiagnosis}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions.makeTS
@@ -12,7 +13,7 @@ class HadDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val dpCodes = Set("G970")
+    val dpCodes = SimpleExtractorCodes(List("G970"))
     val had = spark.read.parquet("src/test/resources/test-input/HAD.parquet")
     val sources = Sources(had = Some(had))
@@ -22,7 +23,7 @@ class HadDiagnosesSuite extends SharedContext {
     ).toDS
 
     // When
-    val result = HadMainDiagnosisExtractor.extract(sources, dpCodes)
+    val result = HadMainDiagnosisExtractor(dpCodes).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -43,7 +44,7 @@ class HadDiagnosesSuite extends SharedContext {
     ).toDS
 
     // When
-    val result = HadMainDiagnosisExtractor.extract(sources, Set.empty)
+    val result = HadMainDiagnosisExtractor(SimpleExtractorCodes.empty).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -54,7 +55,7 @@ class HadDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val associatedDiagnosis = Set("G9")
+    val associatedDiagnosis = SimpleExtractorCodes(List("G9"))
     val had = spark.read.parquet("src/test/resources/test-input/HAD.parquet")
     val sources = Sources(had = Some(had))
@@ -65,7 +66,7 @@ class HadDiagnosesSuite extends SharedContext {
     ).toDS
 
     // When
-    val result = HadAssociatedDiagnosisExtractor.extract(sources, associatedDiagnosis)
+    val result = HadAssociatedDiagnosisExtractor(associatedDiagnosis).extract(sources)
 
     // Then
     assertDSs(result, expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/ImbDiagnosesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/ImbDiagnosesSuite.scala
new file mode 100644
index 00000000..c2898c85
--- /dev/null
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/ImbDiagnosesSuite.scala
@@ -0,0 +1,76 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses
+
+import fr.polytechnique.cmap.cnam.SharedContext
+import fr.polytechnique.cmap.cnam.etl.events.ImbCcamDiagnosis
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+import fr.polytechnique.cmap.cnam.util.functions.makeTS
+
+class ImbDiagnosesSuite extends SharedContext {
+
+  "extract" should "extract diagnosis events from raw data" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val imb = sqlContext.read.load("src/test/resources/test-input/IR_IMB_R.parquet")
+    val expected = Seq(ImbCcamDiagnosis("Patient_02", "C67", makeTS(2006, 3, 13), Some(makeTS(2016, 3, 13)))).toDS
+
+    val sources = Sources(irImb = Some(imb))
+    // When
+    val output = ImbCimDiagnosisExtractor(SimpleExtractorCodes(List("C67"))).extract(sources)
+
+    // Then
+    assertDSs(expected, output)
+  }
+
+  it should "extract all diagnosis events from raw data when an Empty codes is passed" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val imb = sqlContext.read.load("src/test/resources/test-input/IR_IMB_R.parquet")
+    val expected = Seq(
+      ImbCcamDiagnosis("Patient_02", "E11", makeTS(2006, 1, 25), Some(makeTS(2011, 1, 24))),
+      ImbCcamDiagnosis("Patient_02", "C67", makeTS(2006, 3, 13), Some(makeTS(2016, 3, 13))),
+      ImbCcamDiagnosis("Patient_02", "9999", makeTS(2006, 4, 25), Some(makeTS(2016, 4, 25)))
+    ).toDS
+
+    val sources = Sources(irImb = Some(imb))
+    // When
+    val output = ImbCimDiagnosisExtractor(SimpleExtractorCodes.empty).extract(sources)
+
+    // Then
+    assertDSs(expected, output)
+  }
+
+  it should "extract all diagnosis events from raw data when an Empty codes is passed even when ir_imb_f is null" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    //val imb = sqlContext.read.load("src/test/resources/test-input/IR_IMB_R_null.parquet")
+    val imb = Seq(
+      ("Patient_02", "CIM10", "E11", makeTS(2006, 1, 25), Some(makeTS(2011, 1, 24))),
+      ("Patient_02", "CIM10", "C67", makeTS(2006, 3, 13), Some(makeTS(1600, 1, 1))),
+      ("Patient_03", "CIM10", "C67", makeTS(2006, 3, 13), None),
+      ("Patient_02", "CIM10", "9999", makeTS(2006, 4, 25), Some(makeTS(2016, 4, 25)))
+    ).toDF("NUM_ENQ", "MED_NCL_IDT", "MED_MTF_COD", "IMB_ALD_DTD", "IMB_ALD_DTF")
+
+    val expected = Seq(
+      ImbCcamDiagnosis("Patient_02", "E11", makeTS(2006, 1, 25), Some(makeTS(2011, 1, 24))),
+      ImbCcamDiagnosis("Patient_02", "C67", makeTS(2006, 3, 13), None),
+      ImbCcamDiagnosis("Patient_03", "C67", makeTS(2006, 3, 13), None),
+      ImbCcamDiagnosis("Patient_02", "9999", makeTS(2006, 4, 25), Some(makeTS(2016, 4, 25)))
+    ).toDS
+
+    val sources = Sources(irImb = Some(imb))
+    // When
+    val output = ImbCimDiagnosisExtractor(SimpleExtractorCodes.empty).extract(sources)
+
+    // Then
+    assertDSs(expected, output)
+  }
+}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/McoDiagnosesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/McoDiagnosesSuite.scala
similarity index 76%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/McoDiagnosesSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/McoDiagnosesSuite.scala
index 48656c1e..168806a7 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/McoDiagnosesSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/McoDiagnosesSuite.scala
@@ -1,10 +1,10 @@
 // License: BSD 3 clause
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
+package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events.McoMainDiagnosis
-import fr.polytechnique.cmap.cnam.etl.events._
+import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event, McoAssociatedDiagnosis, McoLinkedDiagnosis, McoMainDiagnosis}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions.makeTS
@@ -15,7 +15,7 @@ class McoDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val dpCodes = Set("C67")
+    val dpCodes = SimpleExtractorCodes(List("C67"))
     val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet")
     val sources = Sources(mco = Some(mco))
@@ -30,7 +30,7 @@ class McoDiagnosesSuite extends SharedContext {
 
    // When
-    val result = McoMainDiagnosisExtractor.extract(sources, dpCodes)
+    val result = McoMainDiagnosisExtractor(dpCodes).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -41,7 +41,7 @@ class McoDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val linkedCodes = Set("E05", "E08")
+    val linkedCodes = SimpleExtractorCodes(List("E05", "E08"))
     val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet")
     val sources = Sources(mco = Some(mco))
@@ -51,7 +51,7 @@ class McoDiagnosesSuite extends SharedContext {
     ).toDS
 
     // When
-    val result = McoLinkedDiagnosisExtractor.extract(sources, linkedCodes)
+    val result = McoLinkedDiagnosisExtractor(linkedCodes).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -62,7 +62,7 @@ class McoDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val associatedDiagnosis = Set("C66")
+    val associatedDiagnosis = SimpleExtractorCodes(List("C66"))
     val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet")
     val sources = Sources(mco = Some(mco))
@@ -72,7 +72,7 @@ class McoDiagnosesSuite extends SharedContext {
     ).toDS
 
     // When
-    val result = McoAssociatedDiagnosisExtractor.extract(sources, associatedDiagnosis)
+    val result = McoAssociatedDiagnosisExtractor(associatedDiagnosis).extract(sources)
 
     // Then
     assertDSs(result, expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/SsrDiagnosesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/SsrDiagnosesSuite.scala
similarity index 79%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/SsrDiagnosesSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/SsrDiagnosesSuite.scala
index fcc8e363..dbe9c8a8 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/diagnoses/SsrDiagnosesSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/diagnoses/SsrDiagnosesSuite.scala
@@ -1,7 +1,8 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.diagnoses
+package fr.polytechnique.cmap.cnam.etl.extractors.events.diagnoses
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events._
+import fr.polytechnique.cmap.cnam.etl.events.{Diagnosis, Event, SsrAssociatedDiagnosis, SsrLinkedDiagnosis, SsrMainDiagnosis, SsrTakingOverPurpose}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions.makeTS
@@ -12,7 +13,7 @@ class SsrDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val dpCodes = Set("C66")
+    val dpCodes = SimpleExtractorCodes(List("C66"))
     val ssr = spark.read.parquet("src/test/resources/test-joined/SSR.parquet")
     val sources = Sources(ssr = Some(ssr))
@@ -24,7 +25,7 @@ class SsrDiagnosesSuite extends SharedContext {
 
    // When
-    val result = SsrMainDiagnosisExtractor.extract(sources, dpCodes)
+    val result = SsrMainDiagnosisExtractor(dpCodes).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -46,7 +47,7 @@ class SsrDiagnosesSuite extends SharedContext {
 
    // When
-    val result = SsrMainDiagnosisExtractor.extract(sources, Set.empty)
+    val result = SsrMainDiagnosisExtractor(SimpleExtractorCodes.empty).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -57,18 +58,18 @@ class SsrDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val linkedCodes = Set("C6")
+    val linkedCodes = SimpleExtractorCodes(List("C6"))
     val ssr = spark.read.parquet("src/test/resources/test-joined/SSR.parquet")
     val sources = Sources(ssr = Some(ssr))
     val expected = Seq[Event[Diagnosis]](
       SsrLinkedDiagnosis("Patient_02", "10000123_30000546_200_2019", "C68", makeTS(2019, 8, 11)),
-      SsrLinkedDiagnosis("Patient_02", "10000123_30000546_300_2019", "C66", makeTS(2019, 8, 11))//,
+      SsrLinkedDiagnosis("Patient_02", "10000123_30000546_300_2019", "C66", makeTS(2019, 8, 11)) //,
       //SsrMainDiagnosis("Patient_01", "10000123_30000801_100_2019", "C55", makeTS(2019, 10, 20))
     ).toDS
 
     // When
-    val result = SsrLinkedDiagnosisExtractor.extract(sources, linkedCodes)
+    val result = SsrLinkedDiagnosisExtractor(linkedCodes).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -79,7 +80,7 @@ class SsrDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val associatedDiagnosis = Set("C6")
+    val associatedDiagnosis = SimpleExtractorCodes(List("C6"))
     val ssr = spark.read.parquet("src/test/resources/test-joined/SSR.parquet")
     val sources = Sources(ssr = Some(ssr))
@@ -89,7 +90,7 @@ class SsrDiagnosesSuite extends SharedContext {
     ).toDS
 
     // When
-    val result = SsrAssociatedDiagnosisExtractor.extract(sources, associatedDiagnosis)
+    val result = SsrAssociatedDiagnosisExtractor(associatedDiagnosis).extract(sources)
 
     // Then
     assertDSs(result, expected)
@@ -100,7 +101,7 @@ class SsrDiagnosesSuite extends SharedContext {
     import sqlCtx.implicits._
 
     // Given
-    val cim10Codes = Set("Z100")
+    val cim10Codes = SimpleExtractorCodes(List("Z100"))
     val ssr = spark.read.parquet("src/test/resources/test-joined/SSR.parquet")
     val expected = Seq[Event[Diagnosis]](
       SsrTakingOverPurpose("Patient_02", "10000123_30000546_300_2019", "Z100", makeTS(2019, 8, 11))
@@ -108,7 +109,7 @@ class SsrDiagnosesSuite extends SharedContext {
     val input = Sources(ssr = Some(ssr))
 
     // When
-    val result = SsrTakingOverPurposeExtractor.extract(input, cim10Codes)
+    val result = SsrTakingOverPurposeExtractor(cim10Codes).extract(input)
 
     // Then
     assertDSs(result, expected)
@@ -128,7 +129,7 @@ class SsrDiagnosesSuite extends SharedContext {
     val input = Sources(ssr = Some(ssr))
 
     // When
-    val result = SsrTakingOverPurposeExtractor.extract(input, Set.empty)
+    val result = SsrTakingOverPurposeExtractor(SimpleExtractorCodes.empty).extract(input)
 
     // Then
     assertDSs(result, expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugsExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugsExtractorSuite.scala
new file mode 100644
index 00000000..96a47716
--- /dev/null
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/DrugsExtractorSuite.scala
@@ -0,0 +1,836 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs
+
+import org.apache.spark.sql.Dataset
+import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
+import org.apache.spark.sql.functions.lit
+import org.apache.spark.sql.types.{StringType, StructField, StructType}
+import fr.polytechnique.cmap.cnam.SharedContext
+import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification._
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.{Antidepresseurs, Antihypertenseurs, Hypnotiques, Neuroleptiques}
+import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.{Cip13Level, MoleculeCombinationLevel, PharmacologicalLevel, TherapeuticLevel}
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+import fr.polytechnique.cmap.cnam.util.functions.makeTS
+
+class DrugsExtractorSuite extends SharedContext {
+
+  "extract" should "return all drugs when empty family list is passed" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759"),
+      ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673400", "1949"),
+      ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673500", "1733"),
+      ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673700", "1199")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient1", "9111111111111", 1, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 5, 1)),
+      Drug("patient2", "3400935183644", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1)),
+      Drug("patient3", "3400935418487", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM0MDBfMTk0OQ==", makeTS(2014, 7, 1)),
+      Drug("patient4", "3400935183644", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", makeTS(2014, 8, 1)),
+      Drug("patient8", "3400936889651", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM3MDBfMTE5OQ==", makeTS(2014, 9, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("9111111111111"), "toto", "GC"),
+          (Some("3400935183644"), "toto", "GC"),
+          (Some("3400935418487"), "toto", "GC"),
+          (Some("3400936889651"), "toto", "GC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConf = DrugConfig(Cip13Level, List.empty)
+    // When
+    val result: Dataset[Event[Drug]] = DrugExtractor(drugConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "work correctly based on the DrugConfig Antidepresseurs" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759"),
+      ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673400", "1949"),
+      ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673500", "1733"),
+      ("patient5", Some("3400936889651"), None, "2014-08-01", "2014-07-17", "1", "17", "0", "01C673700", "1199"),
+      ("patient6", None, Some(makeTS(2014, 9, 1)), "2014-08-01", "2014-07-11", "1", "17", "0", "01C673700", "1399"),
+      ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1)), "2014-08-01", "2014-07-12", "1", "17", "0", "01C673700", "1699")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient2", "Antidepresseurs", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1)),
+      Drug("patient4", "Antidepresseurs", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", makeTS(2014, 8, 1)),
+      Drug("patient8", "Antidepresseurs", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTEyXzFfMTdfMF8wMUM2NzM3MDBfMTY5OQ==", makeTS(2014, 9, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("9111111111111"), "toto", "GC"),
+          (Some("3400935183644"), "toto", "GC"),
+          (Some("3400935418487"), "toto", "GC"),
+          (Some("3400936889651"), "toto", "GC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConf = DrugConfig(TherapeuticLevel, List(Antidepresseurs))
+
+    // When
+    val result = DrugExtractor(drugConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "return expected drug purchases with Therapeutic level of classification with class Neuroleptiques" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759"),
+      ("patient3", Some("3400935183644"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673400", "1949")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient2", "Neuroleptiques", 2, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("9111111111111"), "toto", "NGC"),
+          (Some("3400935183644"), "toto", "NGC"),
+          (Some("3400930023648"), "toto", "NGC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConf = DrugConfig(TherapeuticLevel, List(Neuroleptiques))
+    // When
+    val result = DrugExtractor(drugConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "return expected drug purchases with Therapeutic level of classification with class Hypnotiques" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("3400930081143"), Some(makeTS(2014, 6, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400936099777"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient1", "Hypnotiques", 2, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 6, 1)),
+      Drug("patient2", "Hypnotiques", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 7, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("3400930081143"), "toto", "NGC"),
+          (Some("3400936099777"), "toto", "GC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConf = DrugConfig(TherapeuticLevel, List(Hypnotiques))
+    // When
+    val result = DrugExtractor(drugConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "return expected drug purchases with Therapeutic level of classification with class Antihypertenseurs" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("3400937354004"), Some(makeTS(2014, 6, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400936099777"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient1", "Antihypertenseurs", 1, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 6, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("3400937354004"), "toto", "GC"),
+          (Some("3400936099777"), "toto", "GC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConf = DrugConfig(TherapeuticLevel, List(Antihypertenseurs))
+
+    // When
+    val result = DrugExtractor(drugConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "return expected drug purchases with Therapeutic level" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759"),
+      ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673400", "1949"),
+      ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673500", "1733"),
+      ("patient5", Some("3400936889651"), None, "2014-08-01", "2014-07-17", "1", "17", "0", "01C673700", "1199"),
+      ("patient6", None, Some(makeTS(2014, 9, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1959"),
+      ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "2749"),
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient2", "Antidepresseurs", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1)),
+      Drug("patient4", "Antidepresseurs", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", makeTS(2014, 8, 1)),
+      Drug("patient8", "Antidepresseurs", 1, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMjc0OQ==", makeTS(2014, 9, 1)),
+      Drug("patient2", "Neuroleptiques", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("9111111111111"), "toto", "GC"),
+          (Some("3400935183644"), "N06AA04", "GC"),
+          (Some("3400935418487"), "A10BB09", "GC"),
+          (Some("3400936889651"), "N06AB03", "GC"),
+          (Some("3400930023648"), "N05AX12", "GC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConfigAntidepresseurs: DrugClassConfig = Antidepresseurs
+    val drugConfigNeuroleptiques: DrugClassConfig = Neuroleptiques
+    val drugConf = DrugConfig(TherapeuticLevel, List(drugConfigAntidepresseurs, drugConfigNeuroleptiques))
+    // When
+    val result = new DrugExtractor(drugConf).extract(source)
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "return expected drug purchases with Pharmacological level" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759"),
+      ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673400", "1949"),
+      ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673500", "1733"),
+      ("patient5", Some("3400936889651"), None, "2014-08-01", "2014-07-17", "1", "17", "0", "01C673700", "1199"),
+      ("patient6", None, Some(makeTS(2014, 9, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1959"),
+      ("patient8", Some("3400936889651"), Some(makeTS(2014, 9, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "2749"),
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400930023648"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759")
+    ).toDF(
+      "NUM_ENQ", "ER_PHA_F__PHA_PRS_C13", "EXE_SOI_DTD", "FLX_DIS_DTD", "FLX_TRT_DTD",
+      "FLX_EMT_TYP", "FLX_EMT_NUM", "FLX_EMT_ORD", "ORG_CLE_NUM", "DCT_ORD_NUM"
+    )
+
+    val expected: Dataset[Event[Drug]] = Seq(
+      Drug("patient2", "Antidepresseurs_Tricycliques", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1)),
+      Drug("patient4", "Antidepresseurs_Tricycliques", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", makeTS(2014, 8, 1)),
+      Drug("patient8", "Antidepresseurs_ISRS", 1, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMjc0OQ==", makeTS(2014, 9, 1)),
+      Drug("patient2", "Neuroleptiques_Autres_neuroleptiques", 1, "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2014, 6, 1))
+    ).toDS
+
+    val source = new Sources(
+      irPha = Some(
+        Seq(
+          (Some("9111111111111"), "toto", "GC"),
+          (Some("3400935183644"), "N06AA04", "GC"),
+          (Some("3400935418487"), "A10BB09", "GC"),
+          (Some("3400936889651"), "N06AB03", "GC"),
+          (Some("3400930023648"), "N05AX12", "GC")
+        ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "PHA_CND_TOP")
+          .withColumn("molecule_combination", lit(""))
+      ), dcir = Some(inputDF)
+    )
+
+    val drugConfigAntidepresseurs: DrugClassConfig = Antidepresseurs
+    val drugConfigNeuroleptiques: DrugClassConfig = Neuroleptiques
+    val drugConf = DrugConfig(PharmacologicalLevel, List(drugConfigAntidepresseurs, drugConfigNeuroleptiques))
+    // When
+    val result = DrugExtractor(drugConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "return expected drug purchases with MoleculeCombination level" in {
+
+    // Given
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    val inputDF = Seq(
+      ("patient1", Some("9111111111111"), Some(makeTS(2014, 5, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"),
+      ("patient2", Some("3400935183644"), Some(makeTS(2014, 6, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759"),
+      ("patient3", Some("3400935418487"), Some(makeTS(2014, 7, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673400", "1949"),
+      ("patient4", Some("3400935183644"), Some(makeTS(2014, 8, 1)), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673500", "1733"),
+      ("patient5", Some("3400936889651"), None, "2014-08-01", "2014-07-17", "1", "17", "0", "01C673700", "1199"),
+      ("patient6", None, Some(makeTS(2014, 9, 1)), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1959"),
+ ("patient8", Some("3400936889651"), Some( + makeTS( + 2014, + 9, + 1 + ) + ), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "2749"), + ("patient1", Some("9111111111111"), Some( + makeTS( + 2014, + 5, + 1 + ) + ), "2014-09-01", "2014-07-17", "1", "17", "0", "01C673100", "1749"), + ("patient2", Some("3400930023648"), Some( + makeTS( + 2014, + 6, + 1 + ) + ), "2014-08-01", "2014-07-17", "1", "17", "0", "01C673200", "1759") + ).toDF( + "NUM_ENQ", + "ER_PHA_F__PHA_PRS_C13", + "EXE_SOI_DTD", + "FLX_DIS_DTD", + "FLX_TRT_DTD", + "FLX_EMT_TYP", + "FLX_EMT_NUM", + "FLX_EMT_ORD", + "ORG_CLE_NUM", + "DCT_ORD_NUM" + ) + + val expected: Dataset[Event[Drug]] = Seq( + Drug( + "patient2", + "N06AA04", + 1, + "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", + makeTS(2014, 6, 1) + ), + Drug( + "patient4", + "N06AA04", + 1, + "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", + makeTS(2014, 8, 1) + ), + Drug( + "patient8", + "DEXTROPROPOXYPHENE_PARACETAMOL_CAFEINE", + 1, + "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMjc0OQ==", + makeTS(2014, 9, 1) + ), + Drug( + "patient2", + "INSULINE LISPRO (PROTAMINE)", + 1, + "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", + makeTS(2014, 6, 1) + ) + ).toDS + + val irPha = Seq( + (Some("9111111111111"), "toto", "toto", "GC"), + (Some("3400935183644"), "N06AA04", "N06AA04", "GC"), + (Some("3400935418487"), "A10BB09", "A10BB09", "GC"), + (Some("3400936889651"), "N06AB03", "DEXTROPROPOXYPHENE_PARACETAMOL_CAFEINE", "GC"), + (Some("3400930023648"), "N05AX12", "INSULINE LISPRO (PROTAMINE)", "GC") + ).toDF("PHA_CIP_C13", "PHA_ATC_C07", "molecule_combination", "PHA_CND_TOP") + + val source = new Sources(irPha = Some(irPha), dcir = Some(inputDF)) + + val drugConfigAntidepresseurs: DrugClassConfig = Antidepresseurs + val drugConfigNeuroleptiques: DrugClassConfig = Neuroleptiques + val drugConf = DrugConfig(MoleculeCombinationLevel, List(drugConfigAntidepresseurs, drugConfigNeuroleptiques)) + 
// When + val result = DrugExtractor(drugConf).extract(source) + + // Then + assertDSs(result, expected) + } + + "extractGroupId" should "return the group ID for given values" in { + // Given + val schema = StructType( + Seq( + StructField("FLX_DIS_DTD", StringType), + StructField("FLX_TRT_DTD", StringType), + StructField("FLX_EMT_TYP", StringType), + StructField("FLX_EMT_NUM", StringType), + StructField("FLX_EMT_ORD", StringType), + StructField("ORG_CLE_NUM", StringType), + StructField("DCT_ORD_NUM", StringType) + ) + ) + + val values = Array[Any]("2014-08-01", "2014-08-17", "1", "17", "0", "01C673000", "1759") + val r = new GenericRowWithSchema(values, schema) + val expected = "MjAxNC0wOC0wMV8yMDE0LTA4LTE3XzFfMTdfMF8wMUM2NzMwMDBfMTc1OQ==" + + val drugConf = DrugConfig(Cip13Level, List.empty) + + // When + val result = DrugExtractor(drugConf).extractGroupId(r) + + // Then + assert(result == expected) + } + + +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/PharmacologicalClassConfigSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/PharmacologicalClassConfigSuite.scala similarity index 92% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/PharmacologicalClassConfigSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/PharmacologicalClassConfigSuite.scala index 73159c6b..07dc5a81 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/PharmacologicalClassConfigSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/PharmacologicalClassConfigSuite.scala @@ -1,10 +1,10 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.PharmacologicalClassConfig -import 
fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.families.Antidepresseurs +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.PharmacologicalClassConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.families.Antidepresseurs class PharmacologicalClassConfigSuite extends SharedContext { diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/Cip13LevelSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/Cip13LevelSuite.scala similarity index 90% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/Cip13LevelSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/Cip13LevelSuite.scala index a83ee3e6..15922c41 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/Cip13LevelSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/Cip13LevelSuite.scala @@ -1,12 +1,12 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.mockito.Mockito import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema import org.apache.spark.sql.types.{StringType, StructField, StructType} import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} class Cip13LevelSuite extends SharedContext { diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/DrugClassficationLevelSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/DrugClassficationLevelSuite.scala similarity index 90% rename from 
src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/DrugClassficationLevelSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/DrugClassficationLevelSuite.scala index d50dc0a3..5e4196e1 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/DrugClassficationLevelSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/DrugClassficationLevelSuite.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.scalatest.matchers.should.Matchers.{a, convertToAnyShouldWrapper} import fr.polytechnique.cmap.cnam.SharedContext diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/MoleculeCombinationSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/MoleculeCombinationSuite.scala similarity index 90% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/MoleculeCombinationSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/MoleculeCombinationSuite.scala index 70c8c1e7..e5f80f4a 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/MoleculeCombinationSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/MoleculeCombinationSuite.scala @@ -1,12 +1,12 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.mockito.Mockito import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema import org.apache.spark.sql.types.{StringType, StructField, StructType} import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, 
PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} class MoleculeCombinationSuite extends SharedContext { diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/PharmacologicalSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/PharmacologicalSuite.scala similarity index 92% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/PharmacologicalSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/PharmacologicalSuite.scala index fdcc8a64..7b407607 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/PharmacologicalSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/PharmacologicalSuite.scala @@ -1,11 +1,11 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema import org.apache.spark.sql.types.{StringType, StructField, StructType} import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} class PharmacologicalSuite extends SharedContext { diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/TherapeuticSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/TherapeuticSuite.scala similarity index 92% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/TherapeuticSuite.scala rename to 
src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/TherapeuticSuite.scala index 311d8793..9ad52ff5 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/drugs/level/TherapeuticSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/drugs/level/TherapeuticSuite.scala @@ -1,12 +1,12 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.drugs.level +package fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level import org.mockito.Mockito import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema import org.apache.spark.sql.types.{StringType, StructField, StructType} import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.classification.{DrugClassConfig, PharmacologicalClassConfig} class TherapeuticSuite extends SharedContext { diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/HadHospitalStayExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/HadHospitalStayExtractorSuite.scala similarity index 93% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/HadHospitalStayExtractorSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/HadHospitalStayExtractorSuite.scala index 336f10ba..d258e695 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/HadHospitalStayExtractorSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/HadHospitalStayExtractorSuite.scala @@ -1,10 +1,10 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays +import org.apache.spark.sql.Dataset import 
fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.events.{Event, HadHospitalStay, HospitalStay} import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.util.functions.makeTS -import org.apache.spark.sql.Dataset class HadHospitalStayExtractorSuite extends SharedContext { @@ -23,7 +23,7 @@ class HadHospitalStayExtractorSuite extends SharedContext { ).toDS() //When - val result: Dataset[Event[HospitalStay]] = HadHospitalStaysExtractor.extract(sources, Set.empty) + val result: Dataset[Event[HospitalStay]] = HadHospitalStaysExtractor.extract(sources) //Then assertDSs(expected, result) @@ -44,7 +44,7 @@ class HadHospitalStayExtractorSuite extends SharedContext { ).toDS() //When - val result: Dataset[Event[HospitalStay]] = HadHospitalStaysExtractor.extract(sources, Set("Test")) + val result: Dataset[Event[HospitalStay]] = HadHospitalStaysExtractor.extract(sources) //Then assertDSs(expected, result) diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoHospitalStayExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoHospitalStayExtractorSuite.scala new file mode 100644 index 00000000..bc22528b --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoHospitalStayExtractorSuite.scala @@ -0,0 +1,84 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays + +import org.apache.spark.sql.Dataset +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{Event, HospitalStay, McoHospitalStay} +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class McoHospitalStayExtractorSuite extends SharedContext { + + "extract" should "return the hospital stays from mco sources" in { + //Given + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + val 
mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val sources = Sources(mco = Some(mco)) + + val expected: Dataset[Event[HospitalStay]] = Seq( + McoHospitalStay("Patient_02", "10000123_20000123_2007", 8.0D, makeTS(2007, 2, 1), makeTS(2007, 2, 10)), + McoHospitalStay("Patient_02", "10000123_20000345_2007", 8.0D, makeTS(2007, 2, 1), makeTS(2007, 2, 10)), + McoHospitalStay("Patient_02", "10000123_30000546_2008", 8.0D, makeTS(2008, 3, 1), makeTS(2008, 3, 10)), + McoHospitalStay("Patient_02", "10000123_30000852_2008", 8.0D, makeTS(2008, 3, 1), makeTS(2008, 3, 10)), + McoHospitalStay("Patient_02", "10000123_10000987_2006", 8.0D, makeTS(2006, 1, 1), makeTS(2006, 1, 10)), + McoHospitalStay("Patient_02", "10000123_10000543_2006", -1.0D, makeTS(2006, 1, 1), makeTS(2006, 1, 10)) + ).toDS() + + //When + val result: Dataset[Event[HospitalStay]] = McoHospitalStaysExtractor.extract(sources) + + //Then + assertDSs(expected, result) + } + + "extract" should "return the hospital stays from mco sources with non empty set codes" in { + //Given + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val sources = Sources(mco = Some(mco)) + + val expected: Dataset[Event[HospitalStay]] = Seq( + McoHospitalStay("Patient_02", "10000123_20000123_2007", 8.0D, makeTS(2007, 2, 1), makeTS(2007, 2, 10)), + McoHospitalStay("Patient_02", "10000123_20000345_2007", 8.0D, makeTS(2007, 2, 1), makeTS(2007, 2, 10)), + McoHospitalStay("Patient_02", "10000123_30000546_2008", 8.0D, makeTS(2008, 3, 1), makeTS(2008, 3, 10)), + McoHospitalStay("Patient_02", "10000123_30000852_2008", 8.0D, makeTS(2008, 3, 1), makeTS(2008, 3, 10)), + McoHospitalStay("Patient_02", "10000123_10000987_2006", 8.0D, makeTS(2006, 1, 1), makeTS(2006, 1, 10)), + McoHospitalStay("Patient_02", "10000123_10000543_2006", -1.0D, makeTS(2006, 1, 1), makeTS(2006, 1, 10)) + ).toDS() + + //When + val result: Dataset[Event[HospitalStay]] = 
McoHospitalStaysExtractor.extract(sources) + + //Then + assertDSs(expected, result) + } + + "extractWeight" should "calculate correct weight from mco sources" in { + //Given + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + val df = Seq( + ("Patient_02", "10000123", "20000123", "2007", makeTS(2007, 1, 1), makeTS(2007, 1, 10), "8", "5"), + ("Patient_02", "10000123", "20000345", "2007", makeTS(2007, 2, 1), makeTS(2007, 2, 10), "8", "5"), + ("Patient_02", "10000123", "20000546", "2007", makeTS(2007, 3, 1), makeTS(2007, 3, 10), "8", "R"), + ("Patient_02", "10000123", "20000852", "2007", makeTS(2007, 4, 1), makeTS(2007, 4, 10), "8", null), + ("Patient_02", "10000123", "20000987", "2007", makeTS(2007, 5, 1), makeTS(2007, 5, 10), null, null) + ).toDF("NUM_ENQ", "ETA_NUM", "RSA_NUM", "SOR_ANN", "EXE_SOI_DTD", "EXE_SOI_DTF", "MCO_B__ENT_MOD", "MCO_B__ENT_PRV") + + val expected = Seq(8.5D, 8.5D, 8.8D, 8.0D, -1.0D).toDS() + + //When + val result = df.map(r => McoHospitalStaysExtractor.extractWeight(r)) + + //Then + assertDSs(expected, result) + + } + +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoceEmergenciesExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoceEmergenciesExtractorSuite.scala new file mode 100644 index 00000000..7e33563a --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/McoceEmergenciesExtractorSuite.scala @@ -0,0 +1,54 @@ +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.McoceEmergency +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class McoceEmergenciesExtractorSuite extends SharedContext { + + "extract" should "return the hospital stays(emergencies) from mcoce sources" in { + //Given + val sqlCtx = sqlContext + 
import sqlCtx.implicits._ + + val df = Seq( + ("20041", "830100525", "00030885", "2012", makeTS(2012, 4, 21), makeTS(2012, 4, 21), "ATU", 2012), + ("20041", "830100525", "00032716", "2012", makeTS(2012, 4, 28), makeTS(2012, 4, 29), "ATU", 2012), + ("20041", "830100525", "00032738", "2012", makeTS(2012, 4, 29), makeTS(2012, 4, 29), "ATU", 2012), + ("20041", "830100525", "00032038", "2013", makeTS(2013, 4, 29), makeTS(2013, 4, 29), "FTN", 2013), + ("200410", "190000059", "00044158", null, makeTS(2010, 3, 5), makeTS(2010, 3, 5), null, 2010), + ("200410", "190000059", "00027825", null, makeTS(2011, 5, 13), makeTS(2011, 5, 13), null, 2011), + ("200410", "190000059", "00020161", null, makeTS(2012, 4, 10), makeTS(2012, 4, 10), null, 2012), + ("200410", "190000059", "00022621", null, makeTS(2014, 4, 18), makeTS(2014, 5, 18), null, 2014), + ("2004838055", "680000395", "00018597", "2010", makeTS(2010, 7, 11), makeTS(2010, 7, 11), "ATU F", 2010), + ("2006191920", "680000395", "00009656", "2013", makeTS(2013, 9, 24), makeTS(2013, 9, 24), "ATU N", 2013) + ) + .toDF( + "NUM_ENQ", + "ETA_NUM", + "SEQ_NUM", + "MCO_FBSTC__SOR_ANN", + "EXE_SOI_DTD", + "EXE_SOI_DTF", + "MCO_FBSTC__ACT_COD", + "year" + ) + + val sources = Sources(mcoCe = Some(df)) + + val expected = Seq( + McoceEmergency("20041", "830100525_00032716_2012", makeTS(2012, 4, 28), makeTS(2012, 4, 29)), + McoceEmergency("20041", "830100525_00030885_2012", makeTS(2012, 4, 21), makeTS(2012, 4, 21)), + McoceEmergency("20041", "830100525_00032738_2012", makeTS(2012, 4, 29), makeTS(2012, 4, 29)), + McoceEmergency("2004838055", "680000395_00018597_2010", makeTS(2010, 7, 11), makeTS(2010, 7, 11)), + McoceEmergency("2006191920", "680000395_00009656_2013", makeTS(2013, 9, 24), makeTS(2013, 9, 24)) + ).toDS() + + val result = McoceEmergenciesExtractor.extract(sources) + + assertDSs(expected, result) + + } + +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/SSrHospitalStayExtractorSuite.scala 
b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/SSrHospitalStayExtractorSuite.scala similarity index 92% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/SSrHospitalStayExtractorSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/SSrHospitalStayExtractorSuite.scala index 6f8f52e4..0e49f767 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/SSrHospitalStayExtractorSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/hospitalstays/SSrHospitalStayExtractorSuite.scala @@ -1,10 +1,10 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays +package fr.polytechnique.cmap.cnam.etl.extractors.events.hospitalstays +import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.events.{Event, HospitalStay, SsrHospitalStay} import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.util.functions.makeTS -import org.apache.spark.sql.Dataset class SSrHospitalStayExtractorSuite extends SharedContext { @@ -22,7 +22,7 @@ class SSrHospitalStayExtractorSuite extends SharedContext { ).toDS() //When - val result: Dataset[Event[HospitalStay]] = SsrHospitalStaysExtractor.extract(sources, Set.empty) + val result: Dataset[Event[HospitalStay]] = SsrHospitalStaysExtractor.extract(sources) //Then assertDSs(expected, result) @@ -42,7 +42,7 @@ class SSrHospitalStayExtractorSuite extends SharedContext { ).toDS() //When - val result: Dataset[Event[HospitalStay]] = SsrHospitalStaysExtractor.extract(sources, Set("Test")) + val result: Dataset[Event[HospitalStay]] = SsrHospitalStaysExtractor.extract(sources) //Then assertDSs(expected, result) diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/DcirMoleculePurchasesSuite.scala 
b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/DcirMoleculePurchasesSuite.scala similarity index 98% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/DcirMoleculePurchasesSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/DcirMoleculePurchasesSuite.scala index 1d26eb93..8a7c0172 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/DcirMoleculePurchasesSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/DcirMoleculePurchasesSuite.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.molecules +package fr.polytechnique.cmap.cnam.etl.extractors.events.molecules import org.apache.spark.sql.DataFrame import fr.polytechnique.cmap.cnam.SharedContext @@ -146,7 +146,7 @@ class DcirMoleculePurchasesSuite extends SharedContext { //when val extractor = new DcirMoleculePurchases(config) - val result = extractor.getInput(sources).filter(extractor.isInStudy(config.drugClasses.toSet) _).distinct() + val result = extractor.getInput(sources).filter(extractor.isInStudy _).distinct() //then assertDFs(result, expected) @@ -217,7 +217,7 @@ class DcirMoleculePurchasesSuite extends SharedContext { val expected = Seq(Molecule("patient", "SULFONYLUREA", 2700.0, makeTS(2006, 1, 15))).toDS() // When - val result = new DcirMoleculePurchases(config).extract(sources, config.drugClasses.toSet) + val result = new DcirMoleculePurchases(config).extract(sources) // Then assertDSs(result, expected) } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchasesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchasesSuite.scala similarity index 84% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchasesSuite.scala rename to 
src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchasesSuite.scala index 6f2f012d..b10fa674 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/molecules/MoleculePurchasesSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/molecules/MoleculePurchasesSuite.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.molecules +package fr.polytechnique.cmap.cnam.etl.extractors.events.molecules import org.apache.spark.sql.DataFrame import org.apache.spark.sql.functions._ @@ -29,12 +29,12 @@ class MoleculePurchasesSuite extends SharedContext { irPha = Some(irPha), dosages = Some(dosages) ) - val expected = new DcirMoleculePurchases(config).extract(sources, config.drugClasses.toSet).toDF + val expected = new DcirMoleculePurchases(config).extract(sources) // When - val result = new MoleculePurchases(config).extract(sources).toDF + val result = new MoleculePurchases(config).extract(sources) // Then - assertDFs(result, expected) + assertDSs(result, expected) } } \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/DcirNgapActsExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/DcirNgapActsExtractorSuite.scala new file mode 100644 index 00000000..9e2d116b --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/DcirNgapActsExtractorSuite.scala @@ -0,0 +1,82 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts + +import org.apache.spark.sql.DataFrame +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{DcirNgapAct, Event, NgapAct} +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class DcirNgapActsExtractorSuite extends SharedContext { + + "extract" should "extract 
ngap acts events from raw data with a ngapClass based on key letter B2 and coefficient" in { + + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val dcir: DataFrame = sqlCtx.read.load("src/test/resources/test-input/DCIR.parquet") + val irNat: DataFrame = sqlCtx.read.load("src/test/resources/value_tables/IR_NAT_V.parquet") + + val source = new Sources(dcir = Some(dcir), irNat = Some(irNat)) + + val expected = Seq[Event[NgapAct]]( + DcirNgapAct("Patient_01", "unknown_source", "1111_C_0.42", 0.0, makeTS(2006, 2, 1)), + DcirNgapAct("Patient_01", "liberal", "1111_C_0.42", 0.0, makeTS(2006, 1, 15)), + DcirNgapAct("Patient_01", "liberal", "1111_C_0.42", 0.0, makeTS(2006, 1, 30)) + ).toDS + + val ngapConf = NgapActConfig( + actsCategories = List( + new NgapWithNatClassConfig( + ngapKeyLetters = Seq("D"), + ngapCoefficients = Seq("0.45"), + ngapPrsNatRefs = Seq("1111") + ) + ) + ) + + // When + val result = DcirNgapActExtractor(ngapConf).extract(source) + + // Then + assertDSs(result, expected) + } + + + "extract from prsNatRef" should "extract ngap acts events from raw data with a ngapClass based on prsNatRef" in { + + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val dcir: DataFrame = sqlCtx.read.load("src/test/resources/test-input/DCIR.parquet") + val irNat: DataFrame = sqlCtx.read.load("src/test/resources/value_tables/IR_NAT_V.parquet") + + val source = new Sources(dcir = Some(dcir), irNat = Some(irNat)) + + val expected = Seq[Event[NgapAct]]( + DcirNgapAct("Patient_01", "unknown_source", "1111_C_0.42", 0.0, makeTS(2006, 2, 1)), + DcirNgapAct("Patient_01", "liberal", "1111_C_0.42", 0.0, makeTS(2006, 1, 15)), + DcirNgapAct("Patient_01", "liberal", "1111_C_0.42", 0.0, makeTS(2006, 1, 30)) + ).toDS + + val ngapConf = NgapActConfig( + actsCategories = List( + new NgapWithNatClassConfig( + ngapKeyLetters = Seq("D"), + ngapCoefficients = Seq("0.45"), + ngapPrsNatRefs = Seq("1111") + ) + ) + ) + // When + val result = 
DcirNgapActExtractor(ngapConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/McoNgapActsExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/McoNgapActsExtractorSuite.scala
new file mode 100644
index 00000000..ef243a2b
--- /dev/null
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/ngapacts/McoNgapActsExtractorSuite.scala
@@ -0,0 +1,114 @@
+package fr.polytechnique.cmap.cnam.etl.extractors.events.ngapacts
+
+import org.apache.spark.sql.DataFrame
+import fr.polytechnique.cmap.cnam.SharedContext
+import fr.polytechnique.cmap.cnam.etl.events.{Event, McoCeFbstcNgapAct, McoCeFcstcNgapAct, NgapAct}
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+import fr.polytechnique.cmap.cnam.util.functions.makeTS
+
+class McoNgapActsExtractorSuite extends SharedContext {
+
+  "extract" should "extract ngap acts events from raw data with a ngapClass based on key letter B2 and coefficient" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val mcoCe: DataFrame = sqlCtx.read.load("src/test/resources/test-input/MCO_CE.parquet")
+    val source = new Sources(mcoCe = Some(mcoCe))
+
+    val expected = Seq[Event[NgapAct]](
+      McoCeFbstcNgapAct("200410", "190000059_00022621_2014", "PmsiCe_ABG_42.0", makeTS(2014, 4, 18))
+    ).toDS
+
+    val ngapConf = NgapActConfig(
+      actsCategories = List(
+        NgapActClassConfig(
+          ngapKeyLetters = Seq("ABG"),
+          ngapCoefficients = Seq("42.0")
+        )
+      )
+    )
+    // When
+    val result = McoCeFbstcNgapActExtractor(ngapConf).extract(source)
+    // Then
+    assertDSs(result, expected)
+  }
+
+
+  "extract from prsNatRef" should "extract ngap acts events from raw data with a ngapKeyLetter only" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+    // Given
+    val mcoCe: DataFrame = sqlCtx.read.load("src/test/resources/test-input/MCO_CE.parquet")
+    val source = new Sources(mcoCe = Some(mcoCe))
+
+    val expected = Seq[Event[NgapAct]](
+      McoCeFbstcNgapAct("2004100010", "390780146_00064268_2014", "PmsiCe_ABC_1.0", makeTS(2014, 7, 18))
+    ).toDS
+
+    val ngapConf = NgapActConfig(
+      actsCategories = List(
+        NgapActClassConfig(
+          ngapKeyLetters = Seq("ABC"),
+          ngapCoefficients = Seq.empty
+        )
+      )
+    )
+    // When
+    val result = McoCeFbstcNgapActExtractor(ngapConf).extract(source)
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract from prsNatRef" should "extract all ngap acts events from raw MCO_FBSTC data " in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+    // Given
+    val mcoCe: DataFrame = sqlCtx.read.load("src/test/resources/test-input/MCO_CE.parquet")
+    val source = new Sources(mcoCe = Some(mcoCe))
+
+    val expected = Seq[Event[NgapAct]](
+      McoCeFbstcNgapAct("2004100010", "390780146_00064268_2014", "PmsiCe_ABC_1.0", makeTS(2014, 7, 18)),
+      McoCeFbstcNgapAct("200410", "190000059_00022621_2014", "PmsiCe_ABG_42.0", makeTS(2014, 4, 18)),
+      McoCeFbstcNgapAct("2004100010", "390780146_00114237_2014", "PmsiCe_ACO_0", makeTS(2014, 12, 12))
+    ).toDS
+
+    val ngapConf = NgapActConfig(
+      actsCategories = List.empty
+    )
+    // When
+    val result = McoCeFbstcNgapActExtractor(ngapConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract from prsNatRef" should "extract all ngap acts events from raw MCO_FCSTC data " in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+    // Given
+    val mcoCe: DataFrame = sqlCtx.read.load("src/test/resources/test-input/MCO_CE.parquet")
+    val source = new Sources(mcoCe = Some(mcoCe))
+
+    val expected = Seq[Event[NgapAct]](
+      McoCeFcstcNgapAct("2004100010", "390780146_00026744_2014", "PmsiCe_A F_126936.43", makeTS(2014, 4, 4)),
+      McoCeFcstcNgapAct("2004100010", "390780146_00114237_2014", "PmsiCe_ADE_802770.97", makeTS(2014, 12, 12)),
+      McoCeFcstcNgapAct("2004100010", "710780214_00000130_2014", "PmsiCe_ADC_420416.2", makeTS(2014, 4, 15))
+    ).toDS
+
+    val ngapConf = NgapActConfig(
+      actsCategories = List.empty
+    )
+    // When
+    val result = McoCeFcstcNgapActExtractor(ngapConf).extract(source)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityExtractorSuite.scala
new file mode 100644
index 00000000..fc12ddac
--- /dev/null
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/prestations/PractitionerClaimSpecialityExtractorSuite.scala
@@ -0,0 +1,174 @@
+// License: BSD 3 clause
+
+package fr.polytechnique.cmap.cnam.etl.extractors.events.prestations
+
+import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
+import org.apache.spark.sql.types.{StringType, StructField, StructType}
+import fr.polytechnique.cmap.cnam.SharedContext
+import fr.polytechnique.cmap.cnam.etl.events.{Event, McoCeFbstcMedicalPractitionerClaim, McoCeFcstcMedicalPractitionerClaim, MedicalPractitionerClaim, NonMedicalPractitionerClaim, PractitionerClaimSpeciality}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
+import fr.polytechnique.cmap.cnam.util.functions.makeTS
+
+class PractitionerClaimSpecialityExtractorSuite extends SharedContext {
+
+  "extract" should "extract health care related services provided by medical practitioner raw data" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val medicalSpeCodes = SimpleExtractorCodes(List("42"))
+    val input = spark.read.parquet("src/test/resources/test-input/DCIR_w_BIO.parquet")
+    val sources = Sources(dcir = Some(input))
+
+    val expected = Seq[Event[PractitionerClaimSpeciality]](
+      MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 2, 1)),
+      MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 15)),
+      MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 30))
+    ).toDS
+
+
+    // When
+    val result = MedicalPractitionerClaimExtractor(medicalSpeCodes).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+
+  "extract" should "extract health care related services provided by non medical practitioner raw data" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val nonMedicalSpeCodes = SimpleExtractorCodes(List("42"))
+    val input = spark.read.parquet("src/test/resources/test-input/DCIR_w_BIO.parquet")
+    val sources = Sources(dcir = Some(input))
+
+    val expected = Seq[Event[PractitionerClaimSpeciality]](
+      NonMedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 2, 1)),
+      NonMedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 15)),
+      NonMedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 30)),
+      NonMedicalPractitionerClaim("Patient_02", "A10000005", "42", makeTS(2006, 1, 15)),
+      NonMedicalPractitionerClaim("Patient_02", "A10000005", "42", makeTS(2006, 1, 30))
+    ).toDS
+
+
+    // When
+    val result = NonMedicalPractitionerClaimExtractor(nonMedicalSpeCodes).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extractGroupId" should "return the health care practitioner ID" in {
+
+    // Given
+    val schema = StructType(
+      StructField("PFS_EXE_NUM", StringType) :: Nil
+    )
+    val values = Array[Any]("A10000001")
+    val row = new GenericRowWithSchema(values, schema)
+    val expected = "A10000001"
+
+    // When
+    val result = NonMedicalPractitionerClaimExtractor(SimpleExtractorCodes.empty).extractGroupId(row)
+
+    // Then
+    assert(result == expected)
+  }
+
+
+  "extract" should "discard providers with a specialty of 0" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val input = spark.read.parquet("src/test/resources/test-input/DCIR_w_BIO.parquet")
+    val sources = Sources(dcir = Some(input))
+
+    val expected = Seq[Event[PractitionerClaimSpeciality]](
+      MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 2, 1)),
+      MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 15)),
+      MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 30))
+    ).toDS
+
+
+    // When
+    val result = MedicalPractitionerClaimExtractor(SimpleExtractorCodes.empty).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "extract health care related services provided by medical practitioner in McoCe" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val medicalSpeCodes = SimpleExtractorCodes(List("1"))
+    val input = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet")
+    val sources = Sources(mcoCe = Some(input))
+
+    val expected = Seq[Event[PractitionerClaimSpeciality]](
+      McoCeFbstcMedicalPractitionerClaim("2004100010", "390780146_00064268_2014", "1", makeTS(2014, 7, 18))
+    ).toDS
+
+
+    // When
+    val result = McoCeFbstcSpecialtyExtractor(medicalSpeCodes).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "extract all health care related services provided by medical practitioner in McoCe__Fbstc" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val input = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet")
+    val sources = Sources(mcoCe = Some(input))
+
+    val expected = Seq[Event[PractitionerClaimSpeciality]](
+      McoCeFbstcMedicalPractitionerClaim("2004100010", "390780146_00064268_2014", "1", makeTS(2014, 7, 18)),
+      McoCeFbstcMedicalPractitionerClaim("2004100010", "390780146_00114237_2014", "22", makeTS(2014, 12, 12))
+    ).toDS
+
+
+    // When
+    val result = McoCeFbstcSpecialtyExtractor(SimpleExtractorCodes.empty).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extract" should "extract all health care related services provided by medical practitioner in McoCe__Fcstc" in {
+
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val medicalSpeCodes = List.empty
+    val input = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet")
+    val sources = Sources(mcoCe = Some(input))
+
+    val expected = Seq[Event[PractitionerClaimSpeciality]](
+      McoCeFcstcMedicalPractitionerClaim("2004100010", "390780146_00114237_2014", "1", makeTS(2014, 12, 12)),
+      McoCeFcstcMedicalPractitionerClaim("2004100010", "710780214_00000130_2014", "25", makeTS(2014, 4, 15)),
+      McoCeFcstcMedicalPractitionerClaim("2004100010", "390780146_00026744_2014", "13", makeTS(2014, 4, 4))
+    ).toDS
+
+
+    // When
+    val result = McoCeFcstcSpecialtyExtractor(SimpleExtractorCodes.empty).extract(sources)
+
+    // Then
+    assertDSs(result, expected)
+  }
+}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/takeOverReasons/HadTakeOveReasonSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/takeoverreasons/HadTakeOveReasonSuite.scala
similarity index 77%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/takeOverReasons/HadTakeOveReasonSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/takeoverreasons/HadTakeOveReasonSuite.scala
index fbcc5ecd..dc3dded6 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/takeOverReasons/HadTakeOveReasonSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/events/takeoverreasons/HadTakeOveReasonSuite.scala
@@ -1,18 +1,19 @@
-package fr.polytechnique.cmap.cnam.etl.extractors.takeOverReasons
+package fr.polytechnique.cmap.cnam.etl.extractors.events.takeoverreasons
 
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events._
+import fr.polytechnique.cmap.cnam.etl.events.{Event, HadAssociatedTakeOver, HadMainTakeOver, MedicalTakeOverReason}
+import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions._
 
 class HadTakeOveReasonSuite extends SharedContext {
-
+
   "extract" should "return a DataSet of HadMainTakeOveReasons" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val takeOverReasonCodes = Set("1")
+    val takeOverReasonCodes = SimpleExtractorCodes(List("1"))
     val had = spark.read.parquet("src/test/resources/test-input/HAD.parquet")
     val expected = Seq[Event[MedicalTakeOverReason]](
       HadMainTakeOver("patient01", "10000123_30000123_2019", "1", makeTS(2019, 11, 21))
@@ -20,7 +21,7 @@ class HadTakeOveReasonSuite extends SharedContext {
     val input = Sources(had = Some(had))
 
     // When
-    val result = HadMainTakeOverExtractor.extract(input, takeOverReasonCodes)
+    val result = HadMainTakeOverExtractor(takeOverReasonCodes).extract(input)
 
     // Then
     assertDSs(result, expected)
@@ -40,7 +41,7 @@ class HadTakeOveReasonSuite extends SharedContext {
     val input = Sources(had = Some(had))
 
     // When
-    val result = HadMainTakeOverExtractor.extract(input, Set.empty)
+    val result = HadMainTakeOverExtractor(SimpleExtractorCodes.empty).extract(input)
 
     // Then
     assertDSs(result, expected)
@@ -60,7 +61,7 @@ class HadTakeOveReasonSuite extends SharedContext {
     val input = Sources(had = Some(had))
 
     // When
-    val result = HadAssociatedTakeOverExtractor.extract(input, Set.empty)
+    val result = HadAssociatedTakeOverExtractor(SimpleExtractorCodes.empty).extract(input)
 
     // Then
     assertDSs(result, expected)
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStayExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStayExtractorSuite.scala
deleted file mode 100644
index c861f633..00000000
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/hospitalstays/McoHospitalStayExtractorSuite.scala
+++ /dev/null
@@ -1,61 +0,0 @@
-// License: BSD 3 clause
-
-package fr.polytechnique.cmap.cnam.etl.extractors.hospitalstays
-
-import org.apache.spark.sql.Dataset
-import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events.{Event, HospitalStay, McoHospitalStay}
-import fr.polytechnique.cmap.cnam.etl.sources.Sources
-import fr.polytechnique.cmap.cnam.util.functions.makeTS
-
-class McoHospitalStayExtractorSuite extends SharedContext {
-
-  "extract" should "return the hospital stays from mco sources" in {
-    //Given
-    val sqlCtx = sqlContext
-    import sqlCtx.implicits._
-
-    val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet")
-    val sources = Sources(mco = Some(mco))
-
-    val expected: Dataset[Event[HospitalStay]] = Seq(
-      McoHospitalStay("Patient_02", "10000123_20000123_2007", makeTS(2007, 2, 1), makeTS(2007, 2, 10)),
-      McoHospitalStay("Patient_02", "10000123_20000345_2007", makeTS(2007, 2, 1), makeTS(2007, 2, 10)),
-      McoHospitalStay("Patient_02", "10000123_30000546_2008", makeTS(2008, 3, 1), makeTS(2008, 3, 10)),
-      McoHospitalStay("Patient_02", "10000123_30000852_2008", makeTS(2008, 3, 1), makeTS(2008, 3, 10)),
-      McoHospitalStay("Patient_02", "10000123_10000987_2006", makeTS(2006, 1, 1), makeTS(2006, 1, 10)),
-      McoHospitalStay("Patient_02", "10000123_10000543_2006", makeTS(2006, 1, 1), makeTS(2006, 1, 10))
-    ).toDS()
-
-    //When
-    val result: Dataset[Event[HospitalStay]] = McoHospitalStaysExtractor.extract(sources, Set.empty)
-
-    //Then
-    assertDSs(expected, result)
-  }
-
-  "extract" should "return the hospital stays from mco sources with non empty set codes" in {
-    //Given
-    val sqlCtx = sqlContext
-    import sqlCtx.implicits._
-
-    val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet")
-    val sources = Sources(mco = Some(mco))
-
-    val expected: Dataset[Event[HospitalStay]] = Seq(
-      McoHospitalStay("Patient_02", "10000123_20000123_2007", makeTS(2007, 2, 1), makeTS(2007, 2, 10)),
-      McoHospitalStay("Patient_02", "10000123_20000345_2007", makeTS(2007, 2, 1), makeTS(2007, 2, 10)),
-      McoHospitalStay("Patient_02", "10000123_30000546_2008", makeTS(2008, 3, 1), makeTS(2008, 3, 10)),
-      McoHospitalStay("Patient_02", "10000123_30000852_2008", makeTS(2008, 3, 1), makeTS(2008, 3, 10)),
-      McoHospitalStay("Patient_02", "10000123_10000987_2006", makeTS(2006, 1, 1), makeTS(2006, 1, 10)),
-      McoHospitalStay("Patient_02", "10000123_10000543_2006", makeTS(2006, 1, 1), makeTS(2006, 1, 10))
-    ).toDS()
-
-    //When
-    val result: Dataset[Event[HospitalStay]] = McoHospitalStaysExtractor.extract(sources, Set("Test"))
-
-    //Then
-    assertDSs(expected, result)
-  }
-
-}
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/AllPatientExtractorSuite.scala
similarity index 53%
rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsSuite.scala
rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/AllPatientExtractorSuite.scala
index 236855a3..f82bdc37 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/PatientsSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/AllPatientExtractorSuite.scala
@@ -4,44 +4,41 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients
 
 import org.apache.spark.sql.{Column, DataFrame, Dataset}
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events.Event
 import fr.polytechnique.cmap.cnam.etl.patients.Patient
 import fr.polytechnique.cmap.cnam.etl.sources.Sources
 import fr.polytechnique.cmap.cnam.util.functions._
 
-class PatientsSuite extends SharedContext {
+class AllPatientExtractorSuite extends SharedContext {
 
   "isDeathDateValid" should "remove absurd deathDate" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val df: DataFrame = Seq(
-      (makeTS(1989, 3, 13), makeTS(2029, 3, 13)),
-      (makeTS(1989, 3, 13), makeTS(2009, 3, 13)),
-      (makeTS(1989, 3, 13), makeTS(1979, 3, 13))
-    ).toDF("birthDate", "deathDate")
+    val ds: Dataset[Patient] = Seq(
+      Patient("Patient_01", 1, makeTS(1989, 3, 13), Some(makeTS(2009, 3, 13))),
+      Patient("Patient_02", 2, makeTS(1989, 3, 13), Some(makeTS(1979, 3, 13)))
+    ).toDS()
 
-    val deathDates: Column = df("deathDate")
-    val birthDates: Column = df("birthDate")
+    val deathDates: Column = ds("deathDate")
+    val birthDates: Column = ds("birthDate")
 
     val expected = 1
 
     // When
-    val result = df
-      .filter(Patients.validateDeathDate(deathDates, birthDates, 2020) === true)
+    val result = ds
+      .filter(AllPatientExtractor.validateDeathDate(deathDates, birthDates) === true)
       .count
 
     // Then
     assert(result == expected)
   }
 
-  "transform" should "return the correct data in a Dataset[Patient] for a known input" in {
+  "extract" should "return the correct data in a Dataset[Patient] for a known input" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val config = PatientsConfig(ageReferenceDate = makeTS(2006, 1, 1))
     val dcirDf: DataFrame = Seq(
       ("Patient_01", 2, 31, 1945, Some(makeTS(2006, 1, 15)), None),
       ("Patient_01", 2, 31, 1945, Some(makeTS(2006, 1, 30)), None),
@@ -60,11 +57,6 @@ class PatientsSuite extends SharedContext {
       ("Patient_04", 3, 5, 1995)
     ).toDF("NUM_ENQ", "MCO_B__SOR_MOD", "SOR_MOI", "SOR_ANN")
 
-    val ssrDf: DataFrame = Seq(
-      "Patient_01",
-      "Patient_05"
-    ).toDF("SSR_C__NUM_ENQ")
-
     val irBenDf: DataFrame = Seq(
       ("Patient_01", 1, 1, 1945, None),
       ("Patient_02", 1, 2, 1956, Some(makeTS(2009, 3, 13))),
@@ -92,16 +84,69 @@ class PatientsSuite extends SharedContext {
       dcir = Some(dcirDf),
       mco = Some(mcoDf),
       irBen = Some(irBenDf),
-      mcoCe = Some(mcoceDf),
-      ssr = Some(ssrDf))
+      mcoCe = Some(mcoceDf)
+    )
+
+    // When
+    val result = AllPatientExtractor.extract(sources)
+    val expected: Dataset[Patient] = Seq(
+      Patient("Patient_01", 1, makeTS(1945, 1, 1), None),
+      Patient("Patient_02", 1, makeTS(1956, 2, 1), Some(makeTS(2009, 3, 13))),
+      Patient("Patient_03", 2, makeTS(1937, 3, 1), Some(makeTS(1980, 4, 1))),
+      Patient("Patient_04", 2, makeTS(1966, 2, 1), Some(makeTS(2020, 3, 13))),
+      Patient("Patient_05", 1, makeTS(1935, 4, 1), Some(makeTS(2008, 3, 13))),
+      Patient("Patient_06", 1, makeTS(1920, 8, 1), Some(makeTS(1980, 8, 1)))
+    ).toDS()
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "extractBis" should "return the correct data in a Dataset[Patient] without MCO_CE" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val dcirDf: DataFrame = Seq(
+      ("Patient_01", 2, 31, 1945, Some(makeTS(2006, 1, 15)), None),
+      ("Patient_01", 2, 31, 1945, Some(makeTS(2006, 1, 30)), None),
+      ("Patient_02", 1, 47, 1959, Some(makeTS(2006, 1, 15)), Some(makeTS(2009, 3, 13))),
+      ("Patient_02", 1, 47, 1959, Some(makeTS(2006, 1, 30)), Some(makeTS(2009, 3, 13))),
+      ("Patient_03", 1, 47, 1959, Some(makeTS(2006, 1, 30)), None),
+      ("Patient_04", 1, 51, 1966, Some(makeTS(2006, 1, 5)), Some(makeTS(2009, 3, 13))),
+      ("Patient_04", 1, 51, 1966, Some(makeTS(2006, 2, 5)), None),
+      ("Patient_04", 2, 51, 1966, Some(makeTS(2006, 3, 5)), None)
+    ).toDF("NUM_ENQ", "BEN_SEX_COD", "BEN_AMA_COD", "BEN_NAI_ANN", "EXE_SOI_DTD", "BEN_DCD_DTE")
+
+    val mcoDf: DataFrame = Seq(
+      ("Patient_01", 1, 2, 1985),
+      ("Patient_02", 9, 3, 1986),
+      ("Patient_03", 9, 4, 1980),
+      ("Patient_04", 3, 5, 1995)
+    ).toDF("NUM_ENQ", "MCO_B__SOR_MOD", "SOR_MOI", "SOR_ANN")
+
+    val irBenDf: DataFrame = Seq(
+      ("Patient_01", 1, 1, 1945, None),
+      ("Patient_02", 1, 2, 1956, Some(makeTS(2009, 3, 13))),
+      ("Patient_03", 2, 3, 1937, Some(makeTS(1936, 3, 13))),
+      ("Patient_04", 2, 2, 1966, Some(makeTS(2020, 3, 13))),
+      ("Patient_05", 1, 4, 1935, Some(makeTS(2008, 3, 13))),
+      ("Patient_06", 1, 8, 1920, Some(makeTS(1980, 8, 1)))
+    ).toDF("NUM_ENQ", "BEN_SEX_COD", "BEN_NAI_MOI", "BEN_NAI_ANN", "BEN_DCD_DTE")
+
+    val sources = new Sources(
+      dcir = Some(dcirDf),
+      mco = Some(mcoDf),
+      irBen = Some(irBenDf)
+    )
 
     // When
-    val result = new Patients(config).extract(sources)
+    val result = AllPatientExtractor.extract(sources)
     val expected: Dataset[Patient] = Seq(
       Patient("Patient_01", 1, makeTS(1945, 1, 1), None),
       Patient("Patient_02", 1, makeTS(1956, 2, 1), Some(makeTS(2009, 3, 13))),
       Patient("Patient_03", 2, makeTS(1937, 3, 1), Some(makeTS(1980, 4, 1))),
-      Patient("Patient_04", 2, makeTS(1966, 2, 1), Some(makeTS(2009, 3, 13))),
+      Patient("Patient_04", 2, makeTS(1966, 2, 1), Some(makeTS(2020, 3, 13))),
       Patient("Patient_05", 1, makeTS(1935, 4, 1), Some(makeTS(2008, 3, 13))),
       Patient("Patient_06", 1, makeTS(1920, 8, 1), Some(makeTS(1980, 8, 1)))
     ).toDS()
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatientsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatientsSuite.scala
index cbd21efb..798a477d 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatientsSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/DcirPatientsSuite.scala
@@ -2,140 +2,160 @@
 
 package fr.polytechnique.cmap.cnam.etl.extractors.patients
 
-import java.sql.{Date, Timestamp}
-import org.apache.spark.sql.DataFrame
+import java.sql.Timestamp
+import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.SharedContext
 import fr.polytechnique.cmap.cnam.etl.patients.Patient
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
 
 class DcirPatientsSuite extends SharedContext {
 
-  import fr.polytechnique.cmap.cnam.etl.extractors.patients.DcirPatients.DcirPatientsDataFrame
+  "findPatientBirthDate" should "return a Dataset with the birth date for each patient" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val input: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), None),
+      PatientDcir("Patient_02", 1, 47, "1959", null, Timestamp.valueOf("2006-01-05 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
+
+    val expected: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 2, 31, "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Timestamp.valueOf("2006-01-15 00:00:00"), None),
+      PatientDcir("Patient_02", 1, 47, "1959", Timestamp.valueOf("1959-01-01 00:00:00"), Timestamp.valueOf("2006-01-05 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
+
+    // When
+    val result = DcirPatients.findPatientBirthDate(input)
+
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "findGender" should "return a Dataset with the correct gender for each patient" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val input: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), None),
+      PatientDcir("Patient_01", 1, 31, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), None),
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), None),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-05 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
+
+    val expected: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), None),
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), None),
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), None),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-05 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
+
+    // When
+    val result = DcirPatients.findPatientGender(input)
 
-  "findBirthYears" should "return a DataFrame with the birth year for each patient" in {
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "findDeathDate" should "return a Dataset with the correct death date for each patient" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val input: DataFrame = Seq(
-      ("Patient_01", "1975"),
-      ("Patient_01", "1975"),
-      ("Patient_01", "000000"),
-      ("Patient_01", "999999"),
-      ("Patient_01", "2075"),
-      ("Patient_01", "1975"),
-      ("Patient_02", "1959"),
-      ("Patient_02", "1959"),
-      ("Patient_02", "9999"),
-      ("Patient_02", "9999")
-    ).toDF("patientID", "birthYear")
-
-    val expectedResult: DataFrame = Seq(
-      ("Patient_01", "1975"),
-      ("Patient_02", "1959")
-    ).toDF("patientID", "birthYear")
+    val input: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 1, 1, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])),
+      PatientDcir("Patient_02", 1, 34, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 30, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2015-03-13 00:00:00"))),
+      PatientDcir("Patient_03", 1, 1, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("1976-03-13 00:00:00"))),
+      PatientDcir("Patient_04", 1, 45, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2020-03-13 00:00:00")))
+    ).toDS()
+
+    val expected: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 1, 1, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])),
+      PatientDcir("Patient_04", 1, 45, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2020-03-13 00:00:00"))),
+      PatientDcir("Patient_03", 1, 1, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("1976-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 34, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 30, "1975", null, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
 
     // When
-    val result = input.findBirthYears
+    val result = DcirPatients.findPatientDeathDate(input)
 
     // Then
-    assertDFs(result, expectedResult)
+    assertDSs(result, expected)
   }
 
-  "groupByIdAndAge" should "return a DataFrame with data aggregated by patient ID and age" in {
+  "convert PatientDcirtoPatient" should "return Dataset of Patients" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val givenDf: DataFrame = sqlContext.read.parquet("src/test/resources/expected/DCIR.parquet")
-    val input: DataFrame = Seq(
-      ("Patient_01", 31, 2, Date.valueOf("2006-01-15"), None),
-      ("Patient_01", 31, 2, Date.valueOf("2006-01-15"), None),
-      ("Patient_01", 31, 2, Date.valueOf("2006-01-30"), None),
-      ("Patient_02", 47, 1, Date.valueOf("2006-01-05"), Some(Date.valueOf("2009-03-13"))),
-      ("Patient_02", 47, 1, Date.valueOf("2006-01-15"), Some(Date.valueOf("2009-03-13"))),
-      ("Patient_02", 47, 1, Date.valueOf("2006-01-30"), Some(Date.valueOf("2009-03-13"))),
-      ("Patient_02", 47, 1, Date.valueOf("2006-01-30"), Some(Date.valueOf("2009-03-13")))
-    ).toDF("patientID", "age", "gender", "eventDate", "deathDate")
-
-    val expected: DataFrame = Seq(
-      ("Patient_01", 31, 3L, 6L, Date.valueOf("2006-01-15"), Date.valueOf("2006-01-30"),
-        None),
-      ("Patient_02", 47, 4L, 4L, Date.valueOf("2006-01-05"), Date.valueOf("2006-01-30"),
-        Some(Date.valueOf("2009-03-13")))
-    ).toDF(
-      "patientID", "age", "genderCount", "genderSum", "minEventDate", "maxEventDate",
-      "deathDate"
-    )
+    val input: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 2, 45, "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Timestamp.valueOf("1975-01-01 00:00:00"), None),
+      PatientDcir("Patient_02", 1, 49, "1959", Timestamp.valueOf("1959-01-01 00:00:00"), Timestamp.valueOf("1959-01-01 00:00:00"), Some(Timestamp.valueOf("2008-01-25 00:00:00")))
+    ).toDS()
+
+    val expected: Dataset[Patient] = Seq(
+      Patient("Patient_01", 2, Timestamp.valueOf("1975-01-01 00:00:00"), None),
+      Patient("Patient_02", 1, Timestamp.valueOf("1959-01-01 00:00:00"), Some(Timestamp.valueOf("2008-01-25 00:00:00")))
+    ).toDS()
 
     // When
-    val result = input.groupByIdAndAge
+    val result = DcirPatients.fromDerivedPatienttoPatient(input)
 
     // Then
-    assertDFs(result, expected)
+    assertDSs(result, expected)
   }
 
-  "estimateFields" should "return a Dataset[Patient] from a DataFrame with aggregated data" in {
+  "getInput" should "read file" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val input: DataFrame = Seq(
-      ("Patient_01", 31, 3, 6, "1975", Date.valueOf("2006-01-15"),
-        Date.valueOf("2006-01-30"), None),
-      ("Patient_02", 47, 4, 4, "1959", Date.valueOf("2006-01-05"),
-        Date.valueOf("2006-01-30"), Some(Date.valueOf("2009-03-13")))
-    ).toDF(
-      "patientID", "age", "genderCount", "genderSum", "birthYear", "minEventDate",
-      "maxEventDate", "deathDate"
-    )
-
-    val expected: DataFrame = Seq(
-      Patient(
-        patientID = "Patient_01",
-        gender = 2,
-        birthDate = Timestamp.valueOf("1975-01-01 00:00:00"),
-        deathDate = None
-      ),
-      Patient(
-        patientID = "Patient_02",
-        gender = 1,
-        birthDate = Timestamp.valueOf("1959-01-01 00:00:00"),
-        deathDate = Some(Timestamp.valueOf("2009-03-13 00:00:00"))
-      )
-    ).toDF
+    val dcir = spark.read.parquet("src/test/resources/test-input/DCIR.parquet")
+    val sources = Sources(dcir = Some(dcir))
+
+    val expected: Dataset[PatientDcir] = Seq(
+      PatientDcir("Patient_01", 2, 31, "1975", null, null, null),
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-15 00:00:00"), null),
+      PatientDcir("Patient_01", 2, 31, "1975", null, Timestamp.valueOf("2006-01-30 00:00:00"), null),
+      PatientDcir("Patient_02", 1, 47, "1959", null, Timestamp.valueOf("2006-01-15 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1959", null, Timestamp.valueOf("2006-01-30 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1959", null, Timestamp.valueOf("2006-01-30 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))),
+      PatientDcir("Patient_02", 1, 47, "1959", null, Timestamp.valueOf("2006-01-05 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
 
     // When
-    val result = input.estimateFields.toDF
+    val result = DcirPatients.getInput(sources)
 
     // Then
-    assertDFs(result, expected)
+    assertDSs(result, expected)
   }
 
-  "transform" should "return the correct data in a Dataset[Patient] for a known input" in {
+  "extract" should "build patients with actual data" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val dcir: DataFrame = sqlCtx.read.parquet("src/test/resources/expected/DCIR.parquet")
-    val expected: DataFrame = Seq(
-      Patient(
-        patientID = "Patient_01",
-        gender = 2,
-        birthDate = Timestamp.valueOf("1975-01-01 00:00:00"),
-        deathDate = None
-      ),
-      Patient(
-        patientID = "Patient_02",
-        gender = 1,
-        birthDate = Timestamp.valueOf("1959-01-01 00:00:00"),
-        deathDate = Some(Timestamp.valueOf("2009-03-13 00:00:00"))
-      )
-    ).toDF
+    val dcir = spark.read.parquet("src/test/resources/test-input/DCIR.parquet")
+    val sources = Sources(dcir = Some(dcir))
+
+    val expected: Dataset[Patient] = Seq(
+      Patient("Patient_01", 2, Timestamp.valueOf("1975-01-01 00:00:00"), None),
+      Patient("Patient_02", 1, Timestamp.valueOf("1959-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00")))
+    ).toDS()
 
     // When
-    val result = DcirPatients.extract(dcir, 1, 2, 1900, 2020).toDF
+    val result = DcirPatients.extract(sources)
 
     // Then
-    assertDFs(result, expected)
+    assertDSs(result, expected)
   }
 }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatientsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatientsSuite.scala
index d949a0f5..d6409102 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatientsSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/HadPatientsSuite.scala
@@ -1,96 +1,141 @@
 package fr.polytechnique.cmap.cnam.etl.extractors.patients
 
 import java.sql.Timestamp
-
+import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.SharedContext
-import org.apache.spark.sql.DataFrame
-import org.apache.spark.sql.functions._
+import fr.polytechnique.cmap.cnam.etl.patients.Patient
+import fr.polytechnique.cmap.cnam.etl.sources.Sources
 
 class HadPatientsSuite extends SharedContext {
 
-  import fr.polytechnique.cmap.cnam.etl.extractors.patients.HadPatients.HadPatientsDataFrame
+  "findBirthDate" should "return the same dataset" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val input: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))),
+      PatientHad("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00")))
+    ).toDS()
+
+    val expected: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))),
+      PatientHad("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00")))
+    ).toDS()
+
+    // When
+    val result: Dataset[PatientHad] = HadPatients.findPatientBirthDate(input)
 
-  "getDeathDates" should "collect death dates correctly from flat HAD" in {
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "findGender" should "return the same dataset" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val input: DataFrame = Seq(
-      ("Patient_01", 1, 2, 1983),
-      ("Patient_02", 9, 3, 1986)
-    ).toDF("patientID", "SOR_MOD", "SOR_MOI", "SOR_ANN")
+    val input: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))),
+      PatientHad("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00")))
+    ).toDS()
+
+    val expected: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))),
+      PatientHad("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00")))
+    ).toDS()
+
+    // When
+    val result: Dataset[PatientHad] = HadPatients.findPatientGender(input)
 
-    val expected: DataFrame = Seq(
-      ("Patient_02", Timestamp.valueOf("1986-03-01 00:00:00"))
-    ).toDF("patientID", "deathDate")
+    // Then
+    assertDSs(result, expected)
+  }
+
+  "findDeathDate" should "choose minimum death date if a patient has more than one death dates" in {
+    val sqlCtx = sqlContext
+    import sqlCtx.implicits._
+
+    // Given
+    val input: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_01", 1, "2", "1985", 1, null, None),
+      PatientHad("Patient_02", 9, "3", "1986", 1, null, Some(Timestamp.valueOf("1986-03-01 00:00:00"))),
+      PatientHad("Patient_03", 9, "4", "1980", 1, null, Some(Timestamp.valueOf("1980-04-01 00:00:00"))),
+      PatientHad("Patient_03", 9, "4", "1984", 1, null, Some(Timestamp.valueOf("1984-04-01 00:00:00"))),
+      PatientHad("Patient_04", 3, "5", "1995", 1, null, None)
+    ).toDS()
+
+    val expected: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_02", 9, "3", "1986", 1, null, Some(Timestamp.valueOf("1986-03-01 00:00:00"))),
+      PatientHad("Patient_03", 9, "4", "1980", 1, null, Some(Timestamp.valueOf("1980-04-01 00:00:00")))
+    ).toDS()
 
     // When
-    val result: DataFrame = input.getDeathDates(9).select(col("patientID"), col("deathDate"))
+    val result: Dataset[PatientHad] = HadPatients.findPatientDeathDate(input)
 
     // Then
-    assertDFs(result, expected)
+    assertDSs(result, expected)
   }
 
-  it should "choose minimum death date if a patient has more than one death dates" in {
+  "convert PatientHadtoPatient" should "return Dataset of Patients" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val input: DataFrame = Seq(
-      ("Patient_01", 9, 2, 1985),
-      ("Patient_01", 9, 4, 1980)
-    ).toDF("patientID", "SOR_MOD", "SOR_MOI", "SOR_ANN")
+    val input: Dataset[PatientHad] = Seq(
+      PatientHad("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))),
+      PatientHad("Patient_02", 9, "03", "2000", 2, null, Some(Timestamp.valueOf("2000-03-01 00:00:00")))
+    ).toDS()
 
-    val expected: DataFrame = Seq(
-      ("Patient_01", Timestamp.valueOf("1980-04-01 00:00:00"))
-    ).toDF("patientID", "deathDate")
+    val expected: Dataset[Patient] = Seq(
+      Patient("Patient_01", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))),
+      Patient("Patient_02", 2, null, Some(Timestamp.valueOf("2000-03-01 00:00:00")))
+    ).toDS()
 
     // When
-    val result: DataFrame = input.getDeathDates(9).select(col("patientID"), col("deathDate"))
+    val result = HadPatients.fromDerivedPatienttoPatient(input)
 
     // Then
-    assertDFs(result, expected)
+    assertDSs(result, expected)
   }
 
-  "transform" should "return correct Dataset" in {
+  "getInput" should "read file" in {
     val sqlCtx = sqlContext
     import sqlCtx.implicits._
 
     // Given
-    val had: DataFrame = Seq(
-      ("Patient_01", 1, 2, 1980),
-      ("Patient_02", 9, 3, 1986),
-      ("Patient_03", 9, 4, 1980),
-      ("Patient_03", 9, 4, 1984),
-      ("Patient_04", 3, 5, 1995)
-    ).toDF("NUM_ENQ", "HAD_B__SOR_MOD", "HAD_B__SOR_MOI", "HAD_B__SOR_ANN")
-
-    val expected: DataFrame = Seq(
-      ("Patient_02", Timestamp.valueOf("1986-03-01 00:00:00")),
-      ("Patient_03", Timestamp.valueOf("1980-04-01 00:00:00"))
-    ).toDF("patientID", "deathDate")
+    val had
= spark.read.parquet("src/test/resources/test-input/HAD.parquet") + val sources = Sources(had = Some(had)) + + val expected: Dataset[PatientHad] = Seq( + PatientHad("patient01", 8, "1", "2019", 0, null, Some(Timestamp.valueOf("2019-01-01 00:00:00"))), + PatientHad("patient01", -1, "1", "2019", 0, null, Some(Timestamp.valueOf("2019-01-01 00:00:00"))), + PatientHad("patient02", 0, "1", "2019", 0, null, Some(Timestamp.valueOf("2019-01-01 00:00:00"))), + PatientHad("patient02", 0, "1", "2019", 0, null, Some(Timestamp.valueOf("2019-01-01 00:00:00"))) + ).toDS() // When - val result = HadPatients.extract(had) + val result = HadPatients.getInput(sources) // Then - assertDFs(result.toDF, expected) + assertDSs(result, expected) } - "extract" should "extract target HadPatients" in { + "extract" should "build patients with actual data" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given val had = spark.read.parquet("src/test/resources/test-input/HAD.parquet") + val sources = Sources(had = Some(had)) - val result = HadPatients.extract(had) + val expected: Dataset[Patient] = Seq.empty[Patient].toDS() - val expected: DataFrame = Seq.empty[ - (String, Timestamp) - ].toDF("patientID", "deathDate") + // When + val result = HadPatients.extract(sources) // Then assertDSs(result, expected) } + } \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatientsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatientsSuite.scala index 2c701559..69c6c377 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatientsSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/IrBenPatientsSuite.scala @@ -3,157 +3,144 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients import java.sql.Timestamp -import org.apache.spark.sql.DataFrame +import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext +import 
fr.polytechnique.cmap.cnam.etl.patients.Patient +import fr.polytechnique.cmap.cnam.etl.sources.Sources class IrBenPatientsSuite extends SharedContext { - import fr.polytechnique.cmap.cnam.etl.extractors.patients.IrBenPatients.IrBenPatientsDataFrame - - "getBirthDates" should "collect birth dates correctly from IR_BEN_R" in { + "findBirthDate" should "return correct birth dates" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val irBen: DataFrame = Seq( - ("Patient_01", 1, 1975), - ("Patient_02", 2, 1976), - ("Patient_03", 3, 1977), - ("Patient_04", 4, 1895) - ).toDF("patientID", "BEN_NAI_MOI", "BEN_NAI_ANN") - - val expected: DataFrame = Seq( - ("Patient_01", Timestamp.valueOf("1975-01-01 00:00:00")), - ("Patient_02", Timestamp.valueOf("1976-02-01 00:00:00")), - ("Patient_03", Timestamp.valueOf("1977-03-01 00:00:00")) - ).toDF("patientID", "birthDate") + val irBen: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])), + PatientIrBen("Patient_02", 2, "2", "1976", Timestamp.valueOf("1976-02-01 00:00:00"), Some(null.asInstanceOf[Timestamp])) + ).toDS() - // When - val result = irBen.getBirthDate() + val expected: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])), + PatientIrBen("Patient_02", 2, "2", "1976", Timestamp.valueOf("1976-02-01 00:00:00"), Some(null.asInstanceOf[Timestamp])) + ).toDS() - // Then - assertDFs(result, expected) - } - - it should "throw an exception in case of conflicting birth dates" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ + //When + val result = IrBenPatients.findPatientBirthDate(irBen) - // Given - val irBen: DataFrame = Seq( - ("Patient_01", 1, 1975), - ("Patient_01", 2, 1976) - ).toDF("patientID", "BEN_NAI_MOI", "BEN_NAI_ANN") - - // Then - intercept[Exception] { - irBen.getBirthDate() - } + //Then + 
assertDSs(result, expected) } - "getGender" should "return correct gender" in { + "findGender" should "return correct gender" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val input: DataFrame = Seq( - ("Patient_01", 1), - ("Patient_02", 2), - ("Patient_02", 2) - ).toDF("patientID", "BEN_SEX_COD") + val input: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])), + PatientIrBen("Patient_02", 2, "2", "1976", Timestamp.valueOf("1976-02-01 00:00:00"), Some(null.asInstanceOf[Timestamp])) + ).toDS() - val expected: DataFrame = Seq( - ("Patient_01", 1), - ("Patient_02", 2) - ).toDF("patientID", "gender") + val expected: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])), + PatientIrBen("Patient_02", 2, "2", "1976", Timestamp.valueOf("1976-02-01 00:00:00"), Some(null.asInstanceOf[Timestamp])) + ).toDS() // When - val result = input.getGender + val result = IrBenPatients.findPatientGender(input) // Then - assertDFs(result, expected) + assertDSs(result, expected) } - it should "throw an exception in case of conflicting sex code" in { + "findDeathDate" should "find death dates correctly" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val input: DataFrame = Seq( - ("Patient_01", 1), - ("Patient_01", 2) - ).toDF("patientID", "BEN_SEX_COD") + val input: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])), + PatientIrBen("Patient_02", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))), + PatientIrBen("Patient_02", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2015-03-13 00:00:00"))), + PatientIrBen("Patient_03", 1, "1", "1975", Timestamp.valueOf("1975-01-01 
00:00:00"), Some(Timestamp.valueOf("1976-03-13 00:00:00"))), + PatientIrBen("Patient_04", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2020-03-13 00:00:00"))) + ).toDS() + + val expected: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(null.asInstanceOf[Timestamp])), + PatientIrBen("Patient_04", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2020-03-13 00:00:00"))), + PatientIrBen("Patient_03", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("1976-03-13 00:00:00"))), + PatientIrBen("Patient_02", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))), + PatientIrBen("Patient_02", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))) + ).toDS() + + // When + val result = IrBenPatients.findPatientDeathDate(input) // Then - intercept[Exception] { - input.getGender - } + assertDSs(result, expected) } - "getDeathDate" should "collect death dates correctly from IR_BEN_R" in { + "convert PatientIrBentoPatient" should "return Dataset of Patients" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val irBen: DataFrame = Seq( - ("Patient_01", None), - ("Patient_02", Some(Timestamp.valueOf("2009-03-13 00:00:00"))), - ("Patient_03", Some(Timestamp.valueOf("1976-03-13 00:00:00"))), - ("Patient_04", Some(Timestamp.valueOf("2020-03-13 00:00:00"))) - ).toDF("patientID", "BEN_DCD_DTE") - - val expected: DataFrame = Seq( - ("Patient_04", Timestamp.valueOf("2020-03-13 00:00:00")), - ("Patient_03", Timestamp.valueOf("1976-03-13 00:00:00")), - ("Patient_02", Timestamp.valueOf("2009-03-13 00:00:00")) - ).toDF("patientID", "deathDate") + val input: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 1, "1", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 
00:00:00"))), + PatientIrBen("Patient_02", 2, "3", "1977", Timestamp.valueOf("1977-03-01 00:00:00"), Some(null.asInstanceOf[Timestamp])) + ).toDS() + + val expected: Dataset[Patient] = Seq( + Patient("Patient_01", 1, Timestamp.valueOf("1975-01-01 00:00:00"), Some(Timestamp.valueOf("2009-03-13 00:00:00"))), + Patient("Patient_02", 2, Timestamp.valueOf("1977-03-01 00:00:00"), Some(null.asInstanceOf[Timestamp])) + ).toDS() // When - val result = irBen.getDeathDate + val result = IrBenPatients.fromDerivedPatienttoPatient(input) // Then - assertDFs(result, expected) + assertDSs(result, expected) } - "transform" should "return correct result" in { + "getInput" should "read file" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val irBen: DataFrame = Seq( - ("Patient_01", 1, 1, 1975, Timestamp.valueOf("2009-03-13 00:00:00")), - ("Patient_02", 2, 3, 1977, null.asInstanceOf[Timestamp]), - ("Patient_02", 2, 4, 1895, null.asInstanceOf[Timestamp]) - ).toDF("NUM_ENQ", "BEN_SEX_COD", "BEN_NAI_MOI", "BEN_NAI_ANN", "BEN_DCD_DTE") + val irBen = spark.read.parquet("src/test/resources/test-input/IR_BEN_R.parquet") + val sources = Sources(irBen = Some(irBen)) - val expected: DataFrame = Seq( - ("Patient_01", 1, Timestamp.valueOf("1975-01-01 00:00:00"), Timestamp.valueOf("2009-03-13 00:00:00")), - ("Patient_02", 2, Timestamp.valueOf("1977-03-01 00:00:00"), null.asInstanceOf[Timestamp]) - ).toDF("patientID", "gender", "birthDate", "deathDate") + val expected: Dataset[PatientIrBen] = Seq( + PatientIrBen("Patient_01", 2, "01", "1975", Timestamp.valueOf("1975-01-01 00:00:00"), null), + PatientIrBen("Patient_02", 1, "10", "1959", Timestamp.valueOf("1959-10-01 00:00:00"), Some(Timestamp.valueOf("2008-01-25 00:00:00"))) + ).toDS() // When - val result = IrBenPatients.extract(irBen, 1900, 2020) + val result = IrBenPatients.getInput(sources) // Then - assertDFs(result.toDF, expected) + assertDSs(result, expected) } - it should "deal with actual data" in { + "extract" should 
"build patients with actual data" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val irBen = sqlCtx.read.load("src/test/resources/expected/IR_BEN_R.parquet") + val irBen = spark.read.parquet("src/test/resources/test-input/IR_BEN_R.parquet") + val sources = Sources(irBen = Some(irBen)) - val expected: DataFrame = Seq( - ("Patient_01", 2, Timestamp.valueOf("1975-01-01 00:00:00"), null.asInstanceOf[Timestamp]), - ("Patient_02", 1, Timestamp.valueOf("1959-10-01 00:00:00"), Timestamp.valueOf("2008-01-25 00:00:00")) - ).toDF("patientID", "gender", "birthDate", "deathDate") + val expected: Dataset[Patient] = Seq( + Patient("Patient_01", 2, Timestamp.valueOf("1975-01-01 00:00:00"), null), + Patient("Patient_02", 1, Timestamp.valueOf("1959-10-01 00:00:00"), Some(Timestamp.valueOf("2008-01-25 00:00:00"))) + ).toDS() // When - val result = IrBenPatients.extract(irBen, 1900, 2020) + val result = IrBenPatients.extract(sources) // Then - assertDFs(result.toDF, expected) + assertDSs(result, expected) } } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatientsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatientsSuite.scala index e5a6feee..c7ca45cf 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatientsSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McoPatientsSuite.scala @@ -3,78 +3,142 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients import java.sql.Timestamp -import org.apache.spark.sql.DataFrame -import org.apache.spark.sql.functions._ +import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.patients.Patient +import fr.polytechnique.cmap.cnam.etl.sources.Sources class McoPatientsSuite extends SharedContext { - import fr.polytechnique.cmap.cnam.etl.extractors.patients.McoPatients.McoPatientsDataFrame + "findBirthDate" should "return the 
same dataset" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val input: Dataset[PatientMco] = Seq( + PatientMco("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))), + PatientMco("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00"))) + ).toDS() + + val expected: Dataset[PatientMco] = Seq( + PatientMco("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))), + PatientMco("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00"))) + ).toDS() + + // When + val result: Dataset[PatientMco] = McoPatients.findPatientBirthDate(input) + + // Then + assertDSs(result, expected) + } - "getDeathDates" should "collect death dates correctly from flat MCO" in { + "findGender" should "return the same dataset" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val input: DataFrame = Seq( - ("Patient_01", 1, 2, 1985), - ("Patient_02", 9, 3, 1986) - ).toDF("patientID", "SOR_MOD", "SOR_MOI", "SOR_ANN") + val input: Dataset[PatientMco] = Seq( + PatientMco("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))), + PatientMco("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00"))) + ).toDS() - val expected: DataFrame = Seq( - ("Patient_02", Timestamp.valueOf("1986-03-01 00:00:00")) - ).toDF("patientID", "deathDate") + val expected: Dataset[PatientMco] = Seq( + PatientMco("Patient_01", 9, "01", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))), + PatientMco("Patient_02", 9, "03", "2010", 2, null, Some(Timestamp.valueOf("2010-03-01 00:00:00"))) + ).toDS() // When - val result: DataFrame = input.getDeathDates(9).select(col("patientID"), col("deathDate")) + val result: Dataset[PatientMco] = McoPatients.findPatientGender(input) // Then - assertDFs(result, expected) + assertDSs(result, expected) } - it should "choose minimum death date if a patient 
has more than one death dates" in { + "findDeathDate" should "choose the minimum death date if a patient has more than one death date" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val input: DataFrame = Seq( - ("Patient_01", 9, 2, 1985), - ("Patient_01", 9, 4, 1980) - ).toDF("patientID", "SOR_MOD", "SOR_MOI", "SOR_ANN") + val input: Dataset[PatientMco] = Seq( + PatientMco("Patient_01", 1, "2", "1985", 1, null, None), + PatientMco("Patient_02", 9, "3", "1986", 1, null, Some(Timestamp.valueOf("1986-03-01 00:00:00"))), + PatientMco("Patient_03", 9, "4", "1980", 1, null, Some(Timestamp.valueOf("1980-04-01 00:00:00"))), + PatientMco("Patient_03", 9, "4", "1984", 1, null, Some(Timestamp.valueOf("1984-04-01 00:00:00"))), + PatientMco("Patient_04", 3, "5", "1995", 1, null, None) + ).toDS() + + val expected: Dataset[PatientMco] = Seq( + PatientMco("Patient_02", 9, "3", "1986", 1, null, Some(Timestamp.valueOf("1986-03-01 00:00:00"))), + PatientMco("Patient_03", 9, "4", "1980", 1, null, Some(Timestamp.valueOf("1980-04-01 00:00:00"))) + ).toDS() + + // When + val result: Dataset[PatientMco] = McoPatients.findPatientDeathDate(input) - val expected: DataFrame = Seq( - ("Patient_01", Timestamp.valueOf("1980-04-01 00:00:00")) - ).toDF("patientID", "deathDate") + // Then + assertDSs(result, expected) + } + + "convert PatientMcotoPatient" should "return Dataset of Patients" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val input: Dataset[PatientMco] = Seq( + PatientMco("Patient_01", 9, "1", "2009", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))), + PatientMco("Patient_02", 9, "3", "2000", 2, null, Some(Timestamp.valueOf("2000-03-01 00:00:00"))) + ).toDS() + + val expected: Dataset[Patient] = Seq( + Patient("Patient_01", 1, null, Some(Timestamp.valueOf("2009-01-01 00:00:00"))), + Patient("Patient_02", 2, null, Some(Timestamp.valueOf("2000-03-01 00:00:00"))) + ).toDS() // When - val result: DataFrame =
input.getDeathDates(9).select(col("patientID"), col("deathDate")) + val result = McoPatients.fromDerivedPatienttoPatient(input) // Then - assertDFs(result, expected) + assertDSs(result, expected) } - "transform" should "return correct Dataset" in { + "getInput" should "read file" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val mco: DataFrame = Seq( - ("Patient_01", 1, 2, 1985), - ("Patient_02", 9, 3, 1986), - ("Patient_03", 9, 4, 1980), - ("Patient_03", 9, 4, 1984), - ("Patient_04", 3, 5, 1995) - ).toDF("NUM_ENQ", "MCO_B__SOR_MOD", "SOR_MOI", "SOR_ANN") - - val expected: DataFrame = Seq( - ("Patient_02", Timestamp.valueOf("1986-03-01 00:00:00")), - ("Patient_03", Timestamp.valueOf("1980-04-01 00:00:00")) - ).toDF("patientID", "deathDate") + val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val sources = Sources(mco = Some(mco)) + + val expected: Dataset[PatientMco] = Seq( + PatientMco("Patient_02", 5, "2", "2007", 0, null, Some(Timestamp.valueOf("2007-02-01 00:00:00"))), + PatientMco("Patient_02", 5, "2", "2007", 0, null, Some(Timestamp.valueOf("2007-02-01 00:00:00"))), + PatientMco("Patient_02", 5, "1", "2006", 0, null, Some(Timestamp.valueOf("2006-01-01 00:00:00"))), + PatientMco("Patient_02", 5, "1", "2006", 0, null, Some(Timestamp.valueOf("2006-01-01 00:00:00"))), + PatientMco("Patient_02", 5, "3", "2008", 0, null, Some(Timestamp.valueOf("2008-03-01 00:00:00"))), + PatientMco("Patient_02", 5, "3", "2008", 0, null, Some(Timestamp.valueOf("2008-03-01 00:00:00"))) + ).toDS() + + // When + val result = McoPatients.getInput(sources) + + // Then + assertDSs(result, expected) + } + + "extract" should "build patients with actual data" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val sources = Sources(mco = Some(mco)) + + val expected: Dataset[Patient] = Seq.empty[Patient].toDS() // When - val result = 
McoPatients.extract(mco) + val result = McoPatients.extract(sources) // Then - assertDFs(result.toDF, expected) + assertDSs(result, expected) } } \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatientsSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatientsSuite.scala index a7a60212..36e17bfc 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatientsSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/patients/McocePatientsSuite.scala @@ -2,133 +2,184 @@ package fr.polytechnique.cmap.cnam.etl.extractors.patients -import org.apache.spark.sql.functions.lit -import org.apache.spark.sql.types.TimestampType +import java.sql.Timestamp +import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.extractors.patients.McocePatients.McocePatientsImplicit -import fr.polytechnique.cmap.cnam.etl.implicits +import fr.polytechnique.cmap.cnam.etl.patients.Patient import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig -import fr.polytechnique.cmap.cnam.util.functions.makeTS class McocePatientsSuite extends SharedContext { - "calculateBirthYear" should "return a DataFrame with the birth year for each patient" in { - + "findPatientBirthDate" should "return a Dataset with the birth date for each patient" in { val sqlCtx = sqlContext import sqlCtx.implicits._ - //Given - val input = Seq( - ("200410", 1, 79, makeTS(2014, 4, 18)), - ("2004100010", 1, 73, makeTS(2014, 1, 9)), - ("2004100010", 1, 73, makeTS(2014, 2, 11)), - ("2004100010", 1, 74, makeTS(2014, 7, 18)), - ("2004100010", 1, 74, makeTS(2014, 12, 12)), - ("2004100010", 1, 74, makeTS(2014, 4, 15)), - ("2004100010", 1, 74, makeTS(2014, 10, 27)), - ("2004100010", 1, 74, makeTS(2014, 4, 4)), - ("2004100010", 1, 74, makeTS(2014, 11, 6)), - ("2004100010", 
1, 74, makeTS(2014, 5, 2)), - ("2004100010", 1, 74, makeTS(2014, 9, 26)) - ).toDF("patientID", "sex", "age", "event_date") - - val expected = Seq( - ("200410", 1935), - ("2004100010", 1940) - ).toDF("patientID", "birth_year") - - //When - val result = input.calculateBirthYear - - //Then - assertDFs(result, expected) + // Given + val input: Dataset[PatientMcoce] = Seq( + PatientMcoce("200410", 1, 79, null, Timestamp.valueOf("2014-04-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, null, Timestamp.valueOf("2014-01-09 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, null, Timestamp.valueOf("2014-02-11 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-07-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-12-12 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-04-15 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-10-27 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-04-04 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-11-06 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-05-02 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-09-26 00:00:00"), None) + ).toDS() + + val expected: Dataset[PatientMcoce] = Seq( + PatientMcoce("200410", 1, 79, Timestamp.valueOf("1935-04-01 00:00:00"), Timestamp.valueOf("2014-04-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-01-09 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-02-11 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-07-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 
00:00:00"), Timestamp.valueOf("2014-12-12 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-04-15 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-10-27 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-04-04 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-11-06 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-05-02 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-09-26 00:00:00"), None) + ).toDS() + + // When + val result = McocePatients.findPatientBirthDate(input) + + // Then + assertDSs(result, expected) } - "groupByIdAndAge" should "return a DataFrame with data aggregated by patient ID and age" in { - + "findGender" should "return a Dataset with the correct gender for each patient" in { val sqlCtx = sqlContext import sqlCtx.implicits._ - //Given - val input = Seq( - ("200410", 1, 79, makeTS(2014, 4, 18)), - ("2004100010", 1, 73, makeTS(2014, 1, 9)), - ("2004100010", 1, 73, makeTS(2014, 2, 11)), - ("2004100010", 1, 74, makeTS(2014, 7, 18)), - ("2004100010", 1, 74, makeTS(2014, 12, 12)), - ("2004100010", 1, 74, makeTS(2014, 4, 15)), - ("2004100010", 1, 74, makeTS(2014, 10, 27)), - ("2004100010", 1, 74, makeTS(2014, 4, 4)), - ("2004100010", 1, 74, makeTS(2014, 11, 6)), - ("2004100010", 1, 74, makeTS(2014, 5, 2)), - ("2004100010", 1, 74, makeTS(2014, 9, 26)) - ).toDF("patientID", "sex", "age", "event_date") - - val expected = Seq( - ("200410", 79, 1.0, 1.0, makeTS(2014, 4, 18), makeTS(2014, 4, 18)), - ("2004100010", 73, 2.0, 2.0, makeTS(2014, 1, 9), makeTS(2014, 2, 11)), - ("2004100010", 74, 8.0, 8.0, makeTS(2014, 4, 4), makeTS(2014, 12, 12)) 
- ).toDF("patientID", "age", "sum_sex", "count_sex", "min_event_date", "max_event_date") - - //When - val result = input.groupByIdAndAge - - //Then - assertDFs(result, expected) - + // Given + val input: Dataset[PatientMcoce] = Seq( + PatientMcoce("200410", 1, 79, Timestamp.valueOf("1935-04-01 00:00:00"), Timestamp.valueOf("2014-04-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-01-09 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-02-11 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-07-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-12-12 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-04-15 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-10-27 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-04-04 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-11-06 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-05-02 00:00:00"), None), + PatientMcoce("2004100010", 2, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-09-26 00:00:00"), None) + ).toDS() + + val expected: Dataset[PatientMcoce] = Seq( + PatientMcoce("200410", 1, 79, Timestamp.valueOf("1935-04-01 00:00:00"), Timestamp.valueOf("2014-04-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-01-09 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, Timestamp.valueOf("1940-03-01 
00:00:00"), Timestamp.valueOf("2014-02-11 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-07-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-12-12 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-04-15 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-10-27 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-04-04 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-11-06 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-05-02 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, Timestamp.valueOf("1940-03-01 00:00:00"), Timestamp.valueOf("2014-09-26 00:00:00"), None) + ).toDS() + + // When + val result = McocePatients.findPatientGender(input) + + // Then + assertDSs(result, expected) } - "calculateBirthDateAndGender" should "return a DataFrame with aggregated data" in { - + "findDeathDate" should "return the same dataset" in { val sqlCtx = sqlContext import sqlCtx.implicits._ - //Given - val input = Seq( - ("200410", 79, 1.0, 1.0, makeTS(2014, 4, 18), makeTS(2014, 4, 18), 1935), - ("2004100010", 73, 2.0, 2.0, makeTS(2014, 1, 9), makeTS(2014, 2, 11), 1940), - ("2004100010", 74, 8.0, 8.0, makeTS(2014, 4, 4), makeTS(2014, 12, 12), 1940) - ).toDF("patientID", "age", "sum_sex", "count_sex", "min_event_date", "max_event_date", "birth_year") - val expected = Seq( - ("200410", 1, makeTS(1935, 4, 1)), - ("2004100010", 1, makeTS(1940, 3, 1)) - ).toDF("patientID", "gender", "birthDate") + // Given + val input: Dataset[PatientMcoce] = Seq( + PatientMcoce("Patient_01", 2, 45, 
Timestamp.valueOf("1975-01-01 00:00:00"), Timestamp.valueOf("2020-01-01 00:00:00"), None), + PatientMcoce("Patient_02", 1, 50, Timestamp.valueOf("1959-01-01 00:00:00"), Timestamp.valueOf("2010-01-01 00:00:00"), None) + ).toDS() - //When - val result = input.calculateBirthDateAndGender + val expected: Dataset[PatientMcoce] = Seq( + PatientMcoce("Patient_01", 2, 45, Timestamp.valueOf("1975-01-01 00:00:00"), Timestamp.valueOf("2020-01-01 00:00:00"), None), + PatientMcoce("Patient_02", 1, 50, Timestamp.valueOf("1959-01-01 00:00:00"), Timestamp.valueOf("2010-01-01 00:00:00"), None) + ).toDS() - //Then - assertDFs(result, expected) + // When + val result: Dataset[PatientMcoce] = McocePatients.findPatientDeathDate(input) + // Then + assertDSs(result, expected) } - "extract" should "return the correct data in a Dataset[Patient] for a known input" in { + "convert PatientMcocetoPatient" should "return Dataset of Patients" in { val sqlCtx = sqlContext import sqlCtx.implicits._ - //Given - val fallConfig = FallConfig.load("", "test") - val patientsConfig = PatientsConfig(fallConfig.base.studyStart) - import implicits.SourceReader - val sources = Sources.sanitize(sqlContext.readSources(fallConfig.input)) - val expected = Seq( - ("200410", 1, makeTS(1935, 4, 1)), - ("2004100010", 1, makeTS(1940, 3, 1)) - ).toDF("patientID", "gender", "birthDate") - .withColumn("deathDate", lit(null).cast(TimestampType)) + // Given + val input: Dataset[PatientMcoce] = Seq( + PatientMcoce("Patient_01", 2, 45, Timestamp.valueOf("1975-01-01 00:00:00"), Timestamp.valueOf("2020-01-01 00:00:00"), None), + PatientMcoce("Patient_02", 1, 50, Timestamp.valueOf("1959-01-01 00:00:00"), Timestamp.valueOf("2010-01-01 00:00:00"), None) + ).toDS() + val expected: Dataset[Patient] = Seq( + Patient("Patient_01", 2, Timestamp.valueOf("1975-01-01 00:00:00"), None), + Patient("Patient_02", 1, Timestamp.valueOf("1959-01-01 00:00:00"), None) + ).toDS() - //When - val result = McocePatients.extract( - sources.mcoCe.get, 
patientsConfig.minGender, - patientsConfig.maxGender, patientsConfig.minYear, patientsConfig.maxYear - ).toDF() + // When + val result = McocePatients.fromDerivedPatienttoPatient(input) - //Then - assertDFs(result, expected) + // Then + assertDSs(result, expected) + } + + "getInput" should "read file" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + // Given + val mcoce = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet") + val sources = Sources(mcoCe = Some(mcoce)) + + val expected: Dataset[PatientMcoce] = Seq( + PatientMcoce("200410", 1, 79, null, Timestamp.valueOf("2014-04-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, null, Timestamp.valueOf("2014-01-09 00:00:00"), None), + PatientMcoce("2004100010", 1, 73, null, Timestamp.valueOf("2014-02-11 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-07-18 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-12-12 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-04-15 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-10-27 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-04-04 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-11-06 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-05-02 00:00:00"), None), + PatientMcoce("2004100010", 1, 74, null, Timestamp.valueOf("2014-09-26 00:00:00"), None) + ).toDS() + + // When + val result = McocePatients.getInput(sources) + + // Then + assertDSs(result, expected) } + "extract" should "build patients with actual data" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val mcoce = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet") + val sources = Sources(mcoCe = Some(mcoce)) + val expected: Dataset[Patient] = Seq( + Patient("200410", 1, Timestamp.valueOf("1935-04-01 
00:00:00"), None), + Patient("2004100010", 1, Timestamp.valueOf("1940-03-01 00:00:00"), None) + ).toDS() + + // When + val result = McocePatients.extract(sources) + + // Then + assertDSs(result, expected) + } } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityExtractorSuite.scala deleted file mode 100644 index 13184472..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/prestations/PractitionerClaimSpecialityExtractorSuite.scala +++ /dev/null @@ -1,82 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.prestations - -import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema -import org.apache.spark.sql.types.{StringType, StructField, StructType} -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.{Event, MedicalPractitionerClaim, NonMedicalPractitionerClaim, PractitionerClaimSpeciality} -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions.makeTS - -class PractitionerClaimSpecialityExtractorSuite extends SharedContext { - - "extract" should "extract health care related services provided by medical practitioner raw data" in { - - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val medicalSpeCodes = List("42") - val input = spark.read.parquet("src/test/resources/test-input/DCIR.parquet") - val sources = Sources(dcir = Some(input)) - - val expected = Seq[Event[PractitionerClaimSpeciality]]( - MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 2, 1)), - MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 15)), - MedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 30)) - ).toDS - - - // When - val result = 
MedicalPractitionerClaimExtractor.extract(sources, medicalSpeCodes.toSet) - - // Then - assertDSs(result, expected) - } - - - "extract" should "extract health care related services provided by non medical practitioner raw data" in { - - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val nonMedicalSpeCodes = List() - val input = spark.read.parquet("src/test/resources/test-input/DCIR.parquet") - val sources = Sources(dcir = Some(input)) - - val expected = Seq[Event[PractitionerClaimSpeciality]]( - NonMedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 2, 1)), - NonMedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 15)), - NonMedicalPractitionerClaim("Patient_01", "A10000001", "42", makeTS(2006, 1, 30)), - NonMedicalPractitionerClaim("Patient_02", "A10000005", "0", makeTS(2006, 1, 5)), - NonMedicalPractitionerClaim("Patient_02", "A10000005", "42", makeTS(2006, 1, 15)), - NonMedicalPractitionerClaim("Patient_02", "A10000005", "42", makeTS(2006, 1, 30)) - ).toDS - - - // When - val result = NonMedicalPractitionerClaimExtractor.extract(sources, nonMedicalSpeCodes.toSet) - - // Then - assertDSs(result, expected) - } - - "extractGroupId" should "return the health care practitioner ID" in { - - // Given - val schema = StructType( - StructField("PFS_EXE_NUM", StringType) :: Nil - ) - val values = Array[Any]("A10000001") - val row = new GenericRowWithSchema(values, schema) - val expected = "A10000001" - - // When - val result = NonMedicalPractitionerClaimExtractor.extractGroupId(row) - - // Then - assert(result == expected) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirRowExtractorSuite.scala new file mode 100644 index 00000000..77f9a0ce --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/dcir/DcirRowExtractorSuite.scala @@ -0,0 +1,97 @@ 
+// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.dcir + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class DcirRowExtractorSuite extends SharedContext { + + object MockDcirRowExtractor extends DcirRowExtractor + + "extractGroupId" should "return the groupID" in { + // Given + val schema = StructType( + Seq( + StructField("FLX_DIS_DTD", StringType), + StructField("FLX_TRT_DTD", StringType), + StructField("FLX_EMT_TYP", StringType), + StructField("FLX_EMT_NUM", StringType), + StructField("FLX_EMT_ORD", StringType), + StructField("ORG_CLE_NUM", StringType), + StructField("DCT_ORD_NUM", StringType) + ) + ) + + val values = Array[Any]("2014-08-01", "2014-07-17", "1", "17", "0", "01C673000", "1749") + val r = new GenericRowWithSchema(values, schema) + val expected = "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMwMDBfMTc0OQ==" + + // When + val result = MockDcirRowExtractor.extractGroupId(r) + + // Then + assert(result == expected) + } + + "extractPatientId" should "return the patientId" in { + // Given + val schema = StructType( + Seq( + StructField("NUM_ENQ", StringType) + ) + ) + + val values = Array[Any]("Patient") + val r = new GenericRowWithSchema(values, schema) + val expected = "Patient" + + // When + val result = MockDcirRowExtractor.extractPatientId(r) + + // Then + assert(result == expected) + } + + "extractStart" should "return the start date" in { + // Given + val schema = StructType( + Seq( + StructField("EXE_SOI_DTD", DateType), + StructField("FLX_DIS_DTD", DateType) + ) + ) + + val values = Array[Any](makeTS(2014, 8, 1), makeTS(2014, 7, 17)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 8, 1) + + // When + val result = MockDcirRowExtractor.extractStart(r) + + // Then + 
assert(result == expected) + } + + it should "return the flow date when the start date is null" in { + // Given + val schema = StructType( + Seq( + StructField("EXE_SOI_DTD", DateType), + StructField("FLX_DIS_DTD", DateType) + ) + ) + + val values = Array[Any](null, makeTS(2014, 7, 1)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 7, 1) + + // When + val result = MockDcirRowExtractor.extractStart(r) + + // Then + assert(result == expected) + } +} \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadRowExtractorSuite.scala new file mode 100644 index 00000000..dd6bb99b --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadRowExtractorSuite.scala @@ -0,0 +1,72 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.had + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, IntegerType, StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class HadRowExtractorSuite extends SharedContext { + + object MockHadRowExtractor extends HadRowExtractor + + "extractGroupId" should "return the groupID" in { + // Given + val schema = StructType( + Seq( + StructField("ETA_NUM_EPMSI", StringType), + StructField("RHAD_NUM", StringType), + StructField("year", IntegerType) + ) + ) + + val values = Array[Any]("A", "B", 2000) + val r = new GenericRowWithSchema(values, schema) + val expected = "A_B_2000" + + // When + val result = MockHadRowExtractor.extractGroupId(r) + + // Then + assert(result == expected) + } + + "extractPatientId" should "return the patientId" in { + // Given + val schema = StructType( + Seq( + StructField("NUM_ENQ", StringType) + ) + ) + + val values = 
Array[Any]("Patient") + val r = new GenericRowWithSchema(values, schema) + val expected = "Patient" + + // When + val result = MockHadRowExtractor.extractPatientId(r) + + // Then + assert(result == expected) + } + + "extractStart" should "return the start date" in { + // Given + val schema = StructType( + Seq( + StructField("estimated_start", DateType) + ) + ) + + val values = Array[Any](makeTS(2014, 8, 1)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 8, 1) + + // When + val result = MockHadRowExtractor.extractStart(r) + + // Then + assert(result == expected) + } +} \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadSourceSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSourceSuite.scala similarity index 97% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadSourceSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSourceSuite.scala index 799a9678..e5e7cf1a 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/had/HadSourceSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/had/HadSourceSuite.scala @@ -1,9 +1,9 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.had +package fr.polytechnique.cmap.cnam.etl.extractors.sources.had -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.util.functions.makeTS import org.apache.spark.sql.DataFrame import org.apache.spark.sql.functions._ +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS class HadSourceSuite extends SharedContext with HadSource { diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoRowExtractorSuite.scala new file mode 100644 index 00000000..9b5d86d1 --- 
/dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoRowExtractorSuite.scala @@ -0,0 +1,72 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mco + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, IntegerType, StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class McoRowExtractorSuite extends SharedContext { + + object MockMcoRowExtractor extends McoRowExtractor + + "extractGroupId" should "return the groupID" in { + // Given + val schema = StructType( + Seq( + StructField("ETA_NUM", StringType), + StructField("RSA_NUM", StringType), + StructField("SOR_ANN", IntegerType) + ) + ) + + val values = Array[Any]("A", "B", 2000) + val r = new GenericRowWithSchema(values, schema) + val expected = "A_B_2000" + + // When + val result = MockMcoRowExtractor.extractGroupId(r) + + // Then + assert(result == expected) + } + + "extractPatientId" should "return the patientId" in { + // Given + val schema = StructType( + Seq( + StructField("NUM_ENQ", StringType) + ) + ) + + val values = Array[Any]("Patient") + val r = new GenericRowWithSchema(values, schema) + val expected = "Patient" + + // When + val result = MockMcoRowExtractor.extractPatientId(r) + + // Then + assert(result == expected) + } + + "extractStart" should "return the start date" in { + // Given + val schema = StructType( + Seq( + StructField("estimated_start", DateType) + ) + ) + + val values = Array[Any](makeTS(2014, 8, 1)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 8, 1) + + // When + val result = MockMcoRowExtractor.extractStart(r) + + // Then + assert(result == expected) + } +} \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSourceSuite.scala 
b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSourceSuite.scala similarity index 93% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSourceSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSourceSuite.scala index 45c8e95d..8c9eafa1 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/mco/McoSourceSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mco/McoSourceSuite.scala @@ -1,6 +1,6 @@ // License: BSD 3 clause -package fr.polytechnique.cmap.cnam.etl.extractors.mco +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mco import org.apache.spark.sql.DataFrame import org.apache.spark.sql.functions._ @@ -19,7 +19,7 @@ class McoSourceSuite extends SharedContext with McoSource { Some(makeTS(2011, 12, 1)), Some(makeTS(2011, 12, 12)), Some("01122011"), Some("12122011")), ("HasCancer1", Some("C679"), Some("C691"), Some("C643"), Some(0), Some(12), Some(2011), 11, Some(makeTS(2011, 12, 1)), Some(makeTS(2011, 12, 12)), Some("01122011"), Some("12122011")), - ("HasCancer2", Some("C669"), Some("C672"), Some("C643"), Some(0),Some(12), Some(2011), 11, + ("HasCancer2", Some("C669"), Some("C672"), Some("C643"), Some(0), Some(12), Some(2011), 11, None, Some(makeTS(2011, 12, 12)), None, Some("12122011")), ("HasCancer3", Some("C669"), Some("C672"), Some("C643"), Some(0), Some(12), Some(2011), 11, None, None, None, None), @@ -27,7 +27,7 @@ class McoSourceSuite extends SharedContext with McoSource { None, Some(makeTS(2011, 12, 12)), None, Some("12122011")), ("HasCancer5", Some("C679"), Some("B672"), Some("C673"), Some(0), Some(1), Some(2010), 31, Some(makeTS(2011, 12, 1)), Some(makeTS(2011, 12, 12)), Some("01122011"), Some("12122011")), - ("MustBeDropped1", None, None, None, Some(0), Some(1), Some(2010), 31, + ("MustBeDropped1", None, None, None, Some(0), Some(1), Some(2010), 31, Some(makeTS(2011, 12, 1)), Some(makeTS(2011, 12, 12)), 
Some("01122011"), Some("12122011")), ("MustBeDropped2", None, Some("7"), None, Some(0), Some(1), Some(2010), 31, Some(makeTS(2011, 12, 1)), Some(makeTS(2011, 12, 12)), Some("01122011"), Some("12122011")) diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeRowExtractorSuite.scala new file mode 100644 index 00000000..156f496e --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/mcoce/McoCeRowExtractorSuite.scala @@ -0,0 +1,72 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.mcoce + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, IntegerType, StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class McoCeRowExtractorSuite extends SharedContext { + + object MockMcoCeRowExtractor extends McoCeRowExtractor + + "extractGroupId" should "return the groupID" in { + // Given + val schema = StructType( + Seq( + StructField("ETA_NUM", StringType), + StructField("SEQ_NUM", StringType), + StructField("year", IntegerType) + ) + ) + + val values = Array[Any]("A", "B", 2000) + val r = new GenericRowWithSchema(values, schema) + val expected = "A_B_2000" + + // When + val result = MockMcoCeRowExtractor.extractGroupId(r) + + // Then + assert(result == expected) + } + + "extractPatientId" should "return the patientId" in { + // Given + val schema = StructType( + Seq( + StructField("NUM_ENQ", StringType) + ) + ) + + val values = Array[Any]("Patient") + val r = new GenericRowWithSchema(values, schema) + val expected = "Patient" + + // When + val result = MockMcoCeRowExtractor.extractPatientId(r) + + // Then + assert(result == expected) + } + + "extractStart" should "return the start date" in { + // Given + val schema = 
StructType( + Seq( + StructField("EXE_SOI_DTD", DateType) + ) + ) + + val values = Array[Any](makeTS(2014, 8, 1)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 8, 1) + + // When + val result = MockMcoCeRowExtractor.extractStart(r) + + // Then + assert(result == expected) + } +} \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrRowExtractorSuite.scala new file mode 100644 index 00000000..2042f5f1 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrRowExtractorSuite.scala @@ -0,0 +1,73 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, IntegerType, StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class SsrRowExtractorSuite extends SharedContext { + + object MockSsrRowExtractor extends SsrRowExtractor + + "extractGroupId" should "return the groupID" in { + // Given + val schema = StructType( + Seq( + StructField("ETA_NUM", StringType), + StructField("RHA_NUM", StringType), + StructField("RHS_NUM", StringType), + StructField("year", IntegerType) + ) + ) + + val values = Array[Any]("A", "B", "C", 2000) + val r = new GenericRowWithSchema(values, schema) + val expected = "A_B_C_2000" + + // When + val result = MockSsrRowExtractor.extractGroupId(r) + + // Then + assert(result == expected) + } + + "extractPatientId" should "return the patientId" in { + // Given + val schema = StructType( + Seq( + StructField("NUM_ENQ", StringType) + ) + ) + + val values = Array[Any]("Patient") + val r = new GenericRowWithSchema(values, schema) + val expected = "Patient" + + // When + val result = 
MockSsrRowExtractor.extractPatientId(r) + + // Then + assert(result == expected) + } + + "extractStart" should "return the start date" in { + // Given + val schema = StructType( + Seq( + StructField("estimated_start", DateType) + ) + ) + + val values = Array[Any](makeTS(2014, 8, 1)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 8, 1) + + // When + val result = MockSsrRowExtractor.extractStart(r) + + // Then + assert(result == expected) + } +} \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrSourceSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSourceSuite.scala similarity index 50% rename from src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrSourceSuite.scala rename to src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSourceSuite.scala index 32599047..a0104350 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/ssr/SsrSourceSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssr/SsrSourceSuite.scala @@ -1,9 +1,8 @@ -package fr.polytechnique.cmap.cnam.etl.extractors.ssr +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssr +import org.apache.spark.sql.DataFrame import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.util.functions.makeTS -import org.apache.spark.sql.DataFrame -import org.apache.spark.sql.functions._ class SsrSourceSuite extends SharedContext with SsrSource { @@ -11,21 +10,21 @@ class SsrSourceSuite extends SharedContext with SsrSource { val fakeSsrData = { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - Seq( - ("Patient1", Some("C669"), Some("C672"), Some("C643"), Some("12122011"), Some(2011), Some(12)), - ("Patient1", Some("C679"), Some("C691"), Some("C643"), Some("01122011"), Some(2011), Some(12)), - ("Patient1", Some("C679"), Some("C691"), Some("C643"), Some("15012012"), Some(2012), 
Some(1)), - ("Patient2", Some("C669"), Some("C672"), Some("C643"), None, Some(2011), Some(11)), - ("Patient3", Some("C679"), Some("B672"), Some("C673"), None, Some(2011), Some(5)), - ("MustBeDropped1", None, None, None, Some("31122011"), Some(2011), Some(12)) - ).toDF( - "SSR_C__NUM_ENQ", "MOR_PRP", "ETL_AFF", "SSR_D__DGN_COD", - "SSR_C__ENT_DAT", "SSR_C__ANN_LUN_1S", "SSR_C__MOI_LUN_1S" - ) - } + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + Seq( + ("Patient1", Some("C669"), Some("C672"), Some("C643"), Some("12122011"), Some(2011), Some(12)), + ("Patient1", Some("C679"), Some("C691"), Some("C643"), Some("01122011"), Some(2011), Some(12)), + ("Patient1", Some("C679"), Some("C691"), Some("C643"), Some("15012012"), Some(2012), Some(1)), + ("Patient2", Some("C669"), Some("C672"), Some("C643"), None, Some(2011), Some(11)), + ("Patient3", Some("C679"), Some("B672"), Some("C673"), None, Some(2011), Some(5)), + ("MustBeDropped1", None, None, None, Some("31122011"), Some(2011), Some(12)) + ).toDF( + "NUM_ENQ", "SSR_B__MOR_PRP", "SSR_B__ETL_AFF", "SSR_D__DGN_COD", + "ENT_DAT", "ANN_LUN_1S", "MOI_LUN_1S" + ) + } val sqlCtx = sqlContext import sqlCtx.implicits._ diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeRowExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeRowExtractorSuite.scala new file mode 100644 index 00000000..67d65137 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/sources/ssrce/SsrCeRowExtractorSuite.scala @@ -0,0 +1,51 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, StringType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class SsrCeRowExtractorSuite extends SharedContext { + + object 
MockSsrCeRowExtractor extends SsrCeRowExtractor + + "extractPatientId" should "return the patientId" in { + // Given + val schema = StructType( + Seq( + StructField("NUM_ENQ", StringType) + ) + ) + + val values = Array[Any]("Patient") + val r = new GenericRowWithSchema(values, schema) + val expected = "Patient" + + // When + val result = MockSsrCeRowExtractor.extractPatientId(r) + + // Then + assert(result == expected) + } + + "extractStart" should "return the start date" in { + // Given + val schema = StructType( + Seq( + StructField("EXE_SOI_DTD", DateType) + ) + ) + + val values = Array[Any](makeTS(2014, 8, 1)) + val r = new GenericRowWithSchema(values, schema) + val expected = makeTS(2014, 8, 1) + + // When + val result = MockSsrCeRowExtractor.extractStart(r) + + // Then + assert(result == expected) + } +} \ No newline at end of file diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/TracklossesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/TracklossesSuite.scala deleted file mode 100644 index d330a347..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/extractors/tracklosses/TracklossesSuite.scala +++ /dev/null @@ -1,104 +0,0 @@ -// License: BSD 3 clause - -package fr.polytechnique.cmap.cnam.etl.extractors.tracklosses - -import org.apache.spark.sql.DataFrame -import fr.polytechnique.cmap.cnam.SharedContext -import fr.polytechnique.cmap.cnam.etl.events.Trackloss -import fr.polytechnique.cmap.cnam.etl.sources.Sources -import fr.polytechnique.cmap.cnam.util.functions._ - -class TracklossesSuite extends SharedContext { - - "withInterval" should "add the number of month before the next prescription" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val input = Seq( - ("Patient_01", makeTS(2006, 1, 5)), - ("Patient_01", makeTS(2006, 11, 5)), - ("Patient_01", makeTS(2007, 2, 5)) - ).toDF("patientID", "eventDate") - - val expected = Seq( - ("Patient_01", makeTS(2006, 
1, 5), 10), - ("Patient_01", makeTS(2006, 11, 5), 3), - ("Patient_01", makeTS(2007, 2, 5), 34) - ).toDF("patientID", "eventDate", "interval") - - // When - import Tracklosses.TracklossesDataFrame - val result = input.withInterval(makeTS(2009, 12, 31)) - - // Then - assertDFs(result, expected) - } - - "filterTrackLosses" should "remove any line with small interval" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val input = Seq( - ("Patient_01", makeTS(2006, 1, 5), 10), - ("Patient_01", makeTS(2006, 11, 5), 3), - ("Patient_01", makeTS(2007, 2, 5), 34) - ).toDF("patientID", "eventDate", "interval") - - val expected = Seq( - ("Patient_01", makeTS(2006, 1, 5), 10), - ("Patient_01", makeTS(2007, 2, 5), 34) - ).toDF("patientID", "eventDate", "interval") - - // When - import Tracklosses.TracklossesDataFrame - val result = input.filterTrackLosses(4) - - // Then - assertDFs(result, expected) - } - - "withTrackLossDate" should "add the date of the trackloss" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val input = Seq( - ("Patient_01", makeTS(2006, 1, 5), 10), - ("Patient_01", makeTS(2007, 2, 5), 34) - ).toDF("patientID", "eventDate", "interval") - - val expected = Seq( - ("Patient_01", makeTS(2006, 1, 5), 10, makeTS(2006, 3, 5)), - ("Patient_01", makeTS(2007, 2, 5), 34, makeTS(2007, 4, 5)) - ).toDF("patientID", "eventDate", "interval", "tracklossDate") - - // When - import Tracklosses.TracklossesDataFrame - val result = input.withTrackLossDate(2) - - // Then - assertDFs(result, expected) - } - - "extract" should "return correct result" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val config = TracklossesConfig(makeTS(2006, 12, 31)) - val dcir: DataFrame = sqlContext.read.load("src/test/resources/test-input/DCIR.parquet") - val sources = new Sources(dcir = Some(dcir)) - val expected: DataFrame = Seq( - Trackloss("Patient_01", makeTS(2006, 3, 30)), - Trackloss("Patient_02", makeTS(2006, 3, 
30)) - ).toDF - - // When - val result = new Tracklosses(config).extract(sources) - - // Then - assertDFs(result.toDF(), expected) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSourceSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSourceSuite.scala index 9274e78e..e28694c0 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSourceSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/HadSourceSuite.scala @@ -13,30 +13,20 @@ class HadSourceSuite extends SharedContext { HadSource.SEJ_RET, HadSource.FHO_RET, HadSource.PMS_RET, - HadSource.DAT_RET + HadSource.DAT_RET, + HadSource.ETA_NUM_EPMSI ).map(col => col.toString) val input = Seq( - ("1", "1", "1", "1", "1"), - ("1", "1", "1", "1", "1"), - ("1", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0") + ("1", "0", "0", "0", "0", "100000000"), + ("1", "1", "0", "0", "0", "100000001"), + ("0", "0", "0", "0", "0", "100000001"), + ("0", "0", "0", "0", "0", "910100015") + ).toDF(colNames: _*) val expected = Seq( - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0") + ("0", "0", "0", "0", "0", "100000001") ).toDF(colNames: _*) // When diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFiltersSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFiltersSuite.scala index 6b332fe1..be45b2c1 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFiltersSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrFiltersSuite.scala @@ -14,20 +14,20 @@ class SsrFiltersSuite extends SharedContext { SsrSource.SEJ_RET, 
SsrSource.FHO_RET, SsrSource.PMS_RET, - SsrSource.DAT_RET + SsrSource.DAT_RET, + SsrSource.GRG_GME ).map(col => col.toString) val input = Seq( - ("0", "0", "0", "0", "0"), - ("1", "1", "1", "1", "1"), - ("0", "0", "0", "0", "0"), - ("1", "0", "0", "0", "0") + ("0", "0", "0", "0", "0", "900000"), + ("1", "1", "1", "1", "1", "600000"), + ("0", "0", "0", "0", "0", "800000"), + ("1", "0", "0", "0", "0", "900000") ).toDF(colNames: _*) val expected = Seq( - ("0", "0", "0", "0", "0"), - ("0", "0", "0", "0", "0") //filtered + ("0", "0", "0", "0", "0", "800000") ).toDF(colNames: _*) // When diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSourceSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSourceSuite.scala deleted file mode 100644 index d3465571..00000000 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/sources/data/SsrSourceSuite.scala +++ /dev/null @@ -1,74 +0,0 @@ -package fr.polytechnique.cmap.cnam.etl.sources.data - -import fr.polytechnique.cmap.cnam.SharedContext - -class SsrSourceSuite extends SharedContext { - "sanitize" should "return lines that are not corrupted" in { - val sqlCtx = sqlContext - import sqlCtx.implicits._ - - // Given - val colNames = List( - SsrSource.ETA_NUM, - SsrSource.RHA_NUM, - SsrSource.RHS_NUM, - SsrSource.MOR_PRP, - SsrSource.ETL_AFF, - SsrSource.MOI_ANN_SOR_SEJ, - SsrSource.RHS_ANT_SEJ_ENT, - SsrSource.FP_PEC, - SsrSource.NIR_RET, - SsrSource.SEJ_RET, - SsrSource.FHO_RET, - SsrSource.PMS_RET, - SsrSource.DAT_RET, - SsrSource.ENT_DAT, - SsrSource.SOR_DAT, - SsrSource.Year - - ).map(col => col.toString) - - val input = Seq( - ("10000123", "20000123", "123", "C66", "C24", "200910", "14", "Z15", "1", "1", "1", "1", "1", "14062008", "25082008", "2008"), - ("10000123", "20000123", "123", "C66", "C24", "200910", "14", "Z15", "1", "1", "1", "1", "1", "14062008", "25082008", "2008"), - ("10000123", "20000123", "123", "C66", "C24", "200910", "14", "Z15", "1", "0", "0", "0", "0", 
"14062008", "25082008", "2008"), - ("10000123", "20000123", "124", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "125", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "126", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "127", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "128", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "129", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "130", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008") - ).toDF(colNames: _*) - - - val expected = Seq( - ("10000123", "20000123", "124", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "125", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "126", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "127", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "128", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "129", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008"), - ("10000123", "20000123", "130", "C66", "C24", "200910", "14", "Z15", "0", "0", "0", "0", "0", "14062008", "25082008", "2008") - ).toDF(colNames: _*) - - // When - val result = SsrSource.sanitize(input) - - // Then - assertDFs(result, 
expected) - } - - "readAnnotateJoin" should "return annotated joined SSR given SSR_C and SSR_SEJ" in { - val sqlCtx = sqlContext - - val ssrSejPath = "src/test/resources/test-input/SSR_SEJ.parquet" - val ssrCPath = "src/test/resources/test-input/SSR_C.parquet" - val expected = sqlCtx.read.parquet("src/test/resources/test-joined/SSR.parquet") - val result = SsrSource.read( - sqlCtx, - List(ssrSejPath, ssrCPath)) - - assertDFs(result, expected) - } -} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/drugprescription/DrugPrescriptionTransformerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/drugprescription/DrugPrescriptionTransformerSuite.scala new file mode 100644 index 00000000..5b325605 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/drugprescription/DrugPrescriptionTransformerSuite.scala @@ -0,0 +1,59 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.transformers.drugprescription + +import org.apache.spark.sql.Dataset +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{Drug, DrugPrescription, Event} +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class DrugPrescriptionTransformerSuite extends SharedContext { + + "transform" should "combine Drugs that have the same groupID to form a DrugPrescription" in { + // Given + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + //Given + + val input: Dataset[Event[Drug]] = Seq( + Drug("patient", "CITALOPRAM", 2, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)), + Drug("patient", "ZOLPIDEM", 2, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)), + Drug("patient", "CITALOPRAM", 2, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 3, 12)), + Drug("patient", "ZOLPIDEM", 2, "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 3, 12)), + Drug("patient", 
"TIAPRIDE", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 3, 12)) + ).toDS + + val transformer = new DrugPrescriptionTransformer() + + val expected: Dataset[Event[DrugPrescription]] = Seq[Event[DrugPrescription]]( + DrugPrescription("patient", "CITALOPRAM_ZOLPIDEM", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)), + DrugPrescription("patient", "CITALOPRAM_TIAPRIDE_ZOLPIDEM", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 3, 12)) + ).toDS + + val result = transformer.transform(input) + + assertDSs(expected.as[Event[Drug]], result.as[Event[Drug]]) + + } + + "fromDrugs" should "combine Drugs to form a DrugPrescription" in { + //Given + val input: List[Event[Drug]] = List( + Drug("patient", "CITALOPRAM", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)), + Drug("patient", "ZOLPIDEM", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)) + ) + + val transformer = new DrugPrescriptionTransformer() + + val expected: Event[DrugPrescription] = + DrugPrescription("patient", "CITALOPRAM_ZOLPIDEM", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)) + + + val result = transformer.fromDrugs(input) + + assertResult(expected)(result) + + } + +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposurePeriodAdderSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposurePeriodAdderSuite.scala index 883cfcc8..2ceb4f89 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposurePeriodAdderSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposurePeriodAdderSuite.scala @@ -8,7 +8,7 @@ import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event, Exposure, FollowUp} import 
fr.polytechnique.cmap.cnam.util.functions.makeTS -class ExposurePeriodAdderSuite extends SharedContext{ +class ExposurePeriodAdderSuite extends SharedContext { "toExposure" should "transform drugs to exposure based on the limited adder strategy" in { // Given val sqlCtx = sqlContext @@ -17,11 +17,11 @@ class ExposurePeriodAdderSuite extends SharedContext{ //Given val input: Dataset[Event[Drug]] = Seq( - Drug("patient", "Antidepresseurs", 2, makeTS(2014, 1, 8)), - Drug("patient", "Antidepresseurs", 2, makeTS(2014, 2, 5)), - Drug("patient", "Antidepresseurs", 2, makeTS(2014, 3, 12)), - Drug("patient", "Antidepresseurs", 2, makeTS(2014, 4, 20)), - Drug("patient", "Antidepresseurs", 2, makeTS(2014, 6, 3)) + Drug("patient", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 1, 8)), + Drug("patient", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 2, 5)), + Drug("patient", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 3, 12)), + Drug("patient", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 4, 20)), + Drug("patient", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 6, 3)) ).toDS val followUp: Dataset[Event[FollowUp]] = Seq( FollowUp("patient", "any_reason", makeTS(2006, 6, 1), makeTS(2020, 12, 31)), @@ -35,6 +35,7 @@ class ExposurePeriodAdderSuite extends SharedContext{ val exposureAdder = LimitedExposureAdder(0.days, 15.days, 90.days, 30.days, PurchaseCountBased) val result = exposureAdder.toExposure(followUp)(input) + assertDSs(result, expected) } @@ -46,13 +47,13 @@ class ExposurePeriodAdderSuite extends SharedContext{ //Given val input: Dataset[Event[Drug]] = Seq[Event[Drug]]( - Drug("Patient_A", "PIOGLITAZONE", 1, makeTS(2008, 1, 1)), - Drug("Patient_A", "PIOGLITAZONE", 1, makeTS(2008, 2, 1)), - Drug("Patient_A", 
"PIOGLITAZONE", 1, makeTS(2008, 9, 1)), - Drug("Patient_A", "SULFONYLUREA", 1, makeTS(2009, 3, 1)), - Drug("Patient_A", "SULFONYLUREA", 1, makeTS(2009, 10, 1)), - Drug("Patient_B", "PIOGLITAZONE", 1, makeTS(2009, 1, 1)), - Drug("Patient_B", "BENFLUOREX", 1, makeTS(2007, 1, 1)) + Drug("Patient_A", "PIOGLITAZONE", 1,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2008, 1, 1)), + Drug("Patient_A", "PIOGLITAZONE", 1,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2008, 2, 1)), + Drug("Patient_A", "PIOGLITAZONE", 1,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2008, 9, 1)), + Drug("Patient_A", "SULFONYLUREA", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2009, 3, 1)), + Drug("Patient_A", "SULFONYLUREA", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2009, 10, 1)), + Drug("Patient_B", "PIOGLITAZONE", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM0MDBfMTk0OQ==", makeTS(2009, 1, 1)), + Drug("Patient_B", "BENFLUOREX", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", makeTS(2007, 1, 1)) ).toDS val followUp: Dataset[Event[FollowUp]] = Seq[Event[FollowUp]]( @@ -62,44 +63,48 @@ class ExposurePeriodAdderSuite extends SharedContext{ ).toDS() val expected: Dataset[Event[Exposure]] = Seq[Event[Exposure]]( - Exposure("Patient_A", "NA", "PIOGLITAZONE", 1, makeTS(2008, 2, 1), Some(makeTS(2008, 11, 30))) + Exposure("Patient_A", "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", "PIOGLITAZONE", 1, makeTS(2008, 2, 1), Some(makeTS(2008, 11, 30))) ).toDS() val exposureAdder = UnlimitedExposureAdder(3.months, 2, 6.months) val result = exposureAdder.toExposure(followUp)(input) + assertDSs(result, expected) } - it should "transform drugs to exposure based on the unlimited adder strategy with different parameters" in { - // Given - val sqlCtx = sqlContext - import sqlCtx.implicits._ + it should "transform drugs to exposure based on the 
unlimited adder strategy with different parameters" in { + // Given + val sqlCtx = sqlContext + import sqlCtx.implicits._ - //Given + //Given - val input: Dataset[Event[Drug]] = Seq[Event[Drug]]( - Drug("Patient_A", "PIOGLITAZONE", 1, makeTS(2008, 1, 1)), - Drug("Patient_A", "PIOGLITAZONE", 1, makeTS(2008, 2, 1)), - Drug("Patient_A", "PIOGLITAZONE", 1, makeTS(2008, 9, 1)), - Drug("Patient_A", "SULFONYLUREA", 1, makeTS(2009, 3, 1)), - Drug("Patient_A", "SULFONYLUREA", 1, makeTS(2009, 10, 1)), - Drug("Patient_B", "PIOGLITAZONE", 1, makeTS(2009, 1, 1)), - Drug("Patient_B", "BENFLUOREX", 1, makeTS(2007, 1, 1)) - ).toDS + val input: Dataset[Event[Drug]] = Seq[Event[Drug]]( + Drug("Patient_A", "PIOGLITAZONE", 1,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2008, 1, 1)), + Drug("Patient_A", "PIOGLITAZONE", 1,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2008, 2, 1)), + Drug("Patient_A", "PIOGLITAZONE", 1,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2008, 9, 1)), + Drug("Patient_A", "SULFONYLUREA", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2009, 3, 1)), + Drug("Patient_A", "SULFONYLUREA", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMyMDBfMTc1OQ==", makeTS(2009, 10, 1)), + Drug("Patient_B", "PIOGLITAZONE", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM0MDBfMTk0OQ==", makeTS(2009, 1, 1)), + Drug("Patient_B", "BENFLUOREX", 1,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", makeTS(2007, 1, 1)) + ).toDS - val followUp: Dataset[Event[FollowUp]] = Seq[Event[FollowUp]]( - FollowUp("Patient_A", "any_reason", makeTS(2006, 6, 1), makeTS(2008, 11, 30)), - FollowUp("Patient_B", "any_reason", makeTS(2006, 7, 1), makeTS(2007, 7, 1)), - FollowUp("Patient_C", "any_reason", makeTS(2006, 8, 1), makeTS(2009, 12, 31)) - ).toDS() - val expected: Dataset[Event[Exposure]] = Seq[Event[Exposure]]( - Exposure("Patient_A", "NA", "PIOGLITAZONE", 1, makeTS(2008, 1, 1), 
Some(makeTS(2008, 11, 30))), - Exposure("Patient_B", "NA", "BENFLUOREX", 1, makeTS(2007, 1, 1), Some(makeTS(2007, 7, 1))) - ).toDS() - val exposureAdder = UnlimitedExposureAdder(0.months, 1, 0.months) + val followUp: Dataset[Event[FollowUp]] = Seq[Event[FollowUp]]( + FollowUp("Patient_A", "any_reason", makeTS(2006, 6, 1), makeTS(2008, 11, 30)), + FollowUp("Patient_B", "any_reason", makeTS(2006, 7, 1), makeTS(2007, 7, 1)), + FollowUp("Patient_C", "any_reason", makeTS(2006, 8, 1), makeTS(2009, 12, 31)) + ).toDS() + + val expected: Dataset[Event[Exposure]] = Seq[Event[Exposure]]( + Exposure("Patient_A", "MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", "PIOGLITAZONE", 1, makeTS(2008, 1, 1), Some(makeTS(2008, 11, 30))), + Exposure("Patient_B", "MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM1MDBfMTczMw==", "BENFLUOREX", 1, makeTS(2007, 1, 1), Some(makeTS(2007, 7, 1))) + ).toDS() + val exposureAdder = UnlimitedExposureAdder(0.months, 1, 0.months) + + val result = exposureAdder.toExposure(followUp)(input) + + assertDSs(result, expected) + } - val result = exposureAdder.toExposure(followUp)(input) - assertDSs(result, expected) - } } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureTransformerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureTransformerSuite.scala index 817bb3fa..c99a8d46 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureTransformerSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/exposures/ExposureTransformerSuite.scala @@ -9,6 +9,7 @@ import fr.polytechnique.cmap.cnam.etl.events.{Drug, Event, Exposure, FollowUp} import fr.polytechnique.cmap.cnam.util.functions.makeTS class ExposureTransformerSuite extends SharedContext { + "toExposure" should "transform drugs to exposure based on parameters" in { // Given val sqlCtx = sqlContext @@ -16,10 +17,10 @@ class ExposureTransformerSuite extends 
SharedContext { //Given val input: Dataset[Event[Drug]] = Seq( - Drug("Patient_A", "Antidepresseurs", 2, makeTS(2014, 6, 8)), - Drug("Patient_A", "Antidepresseurs", 2, makeTS(2014, 7, 1)), - Drug("Patient_B", "Antidepresseurs", 2, makeTS(2014, 2, 5)), - Drug("Patient_B", "Antidepresseurs", 2, makeTS(2014, 9, 1)) + Drug("Patient_A", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 6, 8)), + Drug("Patient_A", "Antidepresseurs", 2,"MjAxNC0wOS0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzMxMDBfMTc0OQ==", makeTS(2014, 7, 1)), + Drug("Patient_B", "Antidepresseurs", 2,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM0MDBfMTk0OQ==", makeTS(2014, 2, 5)), + Drug("Patient_B", "Antidepresseurs", 2,"MjAxNC0wOC0wMV8yMDE0LTA3LTE3XzFfMTdfMF8wMUM2NzM0MDBfMTk0OQ==", makeTS(2014, 9, 1)) ).toDS val followUp: Dataset[Event[FollowUp]] = Seq( FollowUp("Patient_A", "any_reason", makeTS(2014, 6, 1), makeTS(2016, 12, 31)), @@ -44,6 +45,7 @@ class ExposureTransformerSuite extends SharedContext { ) val result = exposureTransformer.transform(followUp)(input) + assertDSs(expected, result) } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerSuite.scala index b34ebfbb..f34c8632 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/follow_up/FollowUpTransformerSuite.scala @@ -2,14 +2,12 @@ package fr.polytechnique.cmap.cnam.etl.transformers.follow_up -import java.sql.Timestamp import scala.util.Try -import org.mockito.Mockito.mock import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.events._ import fr.polytechnique.cmap.cnam.etl.patients.Patient -import 
fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerUtilities.{DeathReason, DiseaseReason, FollowUpEnd, ObservationEndReason, PatientDates, TrackLossDate, TrackLossReason, endReason, tracklossDateCorrected} +import fr.polytechnique.cmap.cnam.etl.transformers.follow_up.FollowUpTransformerUtilities.{DeathReason, FollowUpEnd, ObservationEndReason, PatientDates, TrackLossDate, TrackLossReason, endReason, tracklossDateCorrected} import fr.polytechnique.cmap.cnam.util.functions.makeTS @@ -160,25 +158,17 @@ class FollowUpTransformerSuite extends SharedContext { import sqlCtx.implicits._ // Given - val input: Dataset[(PatientDates, TrackLossDate, Event[Outcome])] = Seq( - (PatientDates("Patient_A", Some(makeTS(2015, 2, 1)), Some(makeTS(2006, 2, 1)), Some(makeTS(2009, 6, 30))), - TrackLossDate("Patient_A", Some(makeTS(2014, 5, 1))), Outcome( - "Patient_A", - "bladder_cancer", - makeTS(2007, 9, 1) - )) + val input: Dataset[(PatientDates, TrackLossDate)] = Seq( + (PatientDates("Patient_A", Some(makeTS(2015, 2, 1)), Some(makeTS(2006, 2, 1)), Some(makeTS(2014, 5, 1))), + TrackLossDate("Patient_A", Some(makeTS(2014, 5, 1)))) , (PatientDates("Patient_B", Some(makeTS(2016, 2, 1)), Some(makeTS(2006, 1, 1)), Some(makeTS(2012, 6, 30))), - TrackLossDate("Patient_B", Some(makeTS(2010, 2, 1))), Outcome( - "Patient_B", - "bladder_cancer", - makeTS(2011, 4, 1) - )), - (PatientDates("Patient_C", Some(makeTS(2017, 2, 1)), Some(makeTS(2006, 8, 1)), None), - TrackLossDate("Patient_C", None), mock(classOf[Event[Outcome]])), + TrackLossDate("Patient_B", Some(makeTS(2010, 2, 1)))), + (PatientDates("Patient_C", Some(makeTS(2017, 2, 1)), Some(makeTS(2006, 8, 1)), Some(makeTS(2017, 2, 1))), + TrackLossDate("Patient_C", None)), (PatientDates("Patient_D", Some(makeTS(2018, 2, 1)), Some(makeTS(2007, 10, 1)), Some(makeTS(2013, 6, 30))), - TrackLossDate("Patient_D", Some(makeTS(2017, 9, 1))), Outcome("Patient_D", "cancer", makeTS(2013, 6, 30))) + TrackLossDate("Patient_D", 
Some(makeTS(2017, 9, 1)))) ).toDS // When @@ -186,18 +176,16 @@ class FollowUpTransformerSuite extends SharedContext { .map { e => endReason( DeathReason(date = e._1.deathDate), - DiseaseReason(date = Try(Option(e._3.start)).getOrElse(None)), TrackLossReason(date = Try(e._2.trackloss).getOrElse(None)), ObservationEndReason(date = e._1.observationEnd) ) } - val expected: Dataset[FollowUpEnd] = Seq( - FollowUpEnd("Disease", Some(makeTS(2007, 9, 1))), + FollowUpEnd("Trackloss", Some(makeTS(2014, 5, 1))), FollowUpEnd("Trackloss", Some(makeTS(2010, 2, 1))), FollowUpEnd("Death", Some(makeTS(2017, 2, 1))), - FollowUpEnd("Disease", Some(makeTS(2013, 6, 30))) + FollowUpEnd("ObservationEnd", Some(makeTS(2013, 6, 30))) ).toDS @@ -307,7 +295,7 @@ class FollowUpTransformerSuite extends SharedContext { ).toDS val expected = Seq( - FollowUp("Regis", "Disease", makeTS(2006, 3, 1), makeTS(2007, 9, 1)), + FollowUp("Regis", "ObservationEnd", makeTS(2006, 3, 1), makeTS(2009, 1, 1)), FollowUp("pika", "Death", makeTS(2006, 3, 1), makeTS(2008, 10, 1)), FollowUp("patient03", "ObservationEnd", makeTS(2006, 3, 1), makeTS(2009, 1, 1)) ).toDS diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformerSuite.scala index 9d9ee9ad..44d87d96 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformerSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/interaction/NLevelInteractionTransformerSuite.scala @@ -2,6 +2,7 @@ package fr.polytechnique.cmap.cnam.etl.transformers.interaction +import me.danielpes.spark.datetime.implicits._ import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.datatypes.Period @@ -31,8 +32,7 @@ class NLevelInteractionTransformerSuite extends SharedContext { 
ExposureN("Federer", Set("Dopamine", "Diazepam"), Period(makeTS(2019, 4, 1), makeTS(2019, 5, 1))), ExposureN("Federer", Set("Paracetamol", "Dopamine"), Period(makeTS(2019, 3, 1), makeTS(2019, 5, 1))), ExposureN("Federer", Set("Paracetamol", "Dopamine"), Period(makeTS(2019, 7, 1), makeTS(2019, 8, 1))), - ExposureN("Federer", Set("Paracetamol", "Diazepam"), Period(makeTS(2019, 4, 1), makeTS(2019, 6, 1))), - ExposureN("Federer", Set("Alprazolam", "Dopamine"), Period(makeTS(2019, 2, 1), makeTS(2019, 3, 1))) + ExposureN("Federer", Set("Paracetamol", "Diazepam"), Period(makeTS(2019, 4, 1), makeTS(2019, 6, 1))) ).toDS(), Seq[ExposureN]( ExposureN("Federer", Set("Paracetamol"), Period(makeTS(2019, 3, 1), makeTS(2019, 8, 1))), @@ -43,9 +43,10 @@ class NLevelInteractionTransformerSuite extends SharedContext { ).toDS() ) - val result = NLevelInteractionTransformer(InteractionTransformerConfig(3)).elevateToExposureN(exposures, 3) + val result = NLevelInteractionTransformer(InteractionTransformerConfig(3, 30.days)).elevateToExposureN(exposures, 3) // The mapping is necessary for now as Spark seems to struggle with nested Data Structures - result.zip(expected).foreach(e => assertDSs(e._1.map(_.toInteraction).distinct(), e._2.distinct().map(_.toInteraction).distinct())) + result.zip(expected) + .foreach(e => assertDSs(e._1.map(_.toInteraction).distinct(), e._2.distinct().map(_.toInteraction).distinct())) } @@ -101,7 +102,8 @@ class NLevelInteractionTransformerSuite extends SharedContext { val result = NLevelInteractionTransformer(InteractionTransformerConfig(3)).trickleDownExposureN(input) // The mapping is necessary for now as Spark seems to struggle with nested Data Structures - result.zip(expected).foreach(e => assertDSs(e._1.map(_.toInteraction).distinct(), e._2.distinct().map(_.toInteraction).distinct())) + result.zip(expected) + .foreach(e => assertDSs(e._1.map(_.toInteraction).distinct(), e._2.distinct().map(_.toInteraction).distinct())) } 
"reduceHigherExposuresNFromLowerExposures" should "reduce the time period of higher ExposureN from lower ExposureN" in { @@ -158,8 +160,17 @@ class NLevelInteractionTransformerSuite extends SharedContext { ).toDS() ) - val result = NLevelInteractionTransformer(InteractionTransformerConfig(3)).reduceHigherExposuresNFromLowerExposures(interactions, higherInteractionInvolvement) - result.zip(expected).foreach(e => assertDSs(e._1.map(_.e.toInteraction).distinct(), e._2.distinct().map(_.toInteraction).distinct())) + val result = NLevelInteractionTransformer(InteractionTransformerConfig(3)).reduceHigherExposuresNFromLowerExposures( + interactions, + higherInteractionInvolvement + ) + result.zip(expected) + .foreach( + e => assertDSs( + e._1.map(_.e.toInteraction).distinct(), + e._2.distinct().map(_.toInteraction).distinct() + ) + ) } "transform" should "create interactions of level N" in { @@ -176,14 +187,14 @@ class NLevelInteractionTransformerSuite extends SharedContext { ).toDS() val expected: Dataset[Event[Interaction]] = Seq[ExposureN]( - ExposureN("Federer", Set("Paracetamol", "Dopamine", "Diazepam"), Period(makeTS(2019, 4, 1), makeTS(2019, 5, 1))), - ExposureN("Federer", Set("Paracetamol", "Dopamine"), Period(makeTS(2019, 3, 1), makeTS(2019, 4, 1))), - ExposureN("Federer", Set("Paracetamol", "Dopamine"), Period(makeTS(2019, 7, 1), makeTS(2019, 8, 1))), - ExposureN("Federer", Set("Paracetamol", "Diazepam"), Period(makeTS(2019, 5, 1), makeTS(2019, 6, 1))), - ExposureN("Federer", Set("Alprazolam", "Dopamine"), Period(makeTS(2019, 2, 1), makeTS(2019, 3, 1))), - ExposureN("Federer", Set("Paracetamol"), Period(makeTS(2019, 6, 1), makeTS(2019, 7, 1))), - ExposureN("Federer", Set("Alprazolam"), Period(makeTS(2019, 1, 1), makeTS(2019, 2, 1))) - ).toDS.map[Event[Interaction]]((e: ExposureN) => e.toInteraction) + ExposureN("Federer", Set("Paracetamol", "Dopamine", "Diazepam"), Period(makeTS(2019, 4, 1), makeTS(2019, 5, 1))), + ExposureN("Federer", Set("Paracetamol", 
"Dopamine"), Period(makeTS(2019, 3, 1), makeTS(2019, 4, 1))), + ExposureN("Federer", Set("Paracetamol", "Dopamine"), Period(makeTS(2019, 7, 1), makeTS(2019, 8, 1))), + ExposureN("Federer", Set("Paracetamol", "Diazepam"), Period(makeTS(2019, 5, 1), makeTS(2019, 6, 1))), + ExposureN("Federer", Set("Dopamine"), Period(makeTS(2019, 2, 1), makeTS(2019, 3, 1))), + ExposureN("Federer", Set("Paracetamol"), Period(makeTS(2019, 6, 1), makeTS(2019, 7, 1))), + ExposureN("Federer", Set("Alprazolam"), Period(makeTS(2019, 1, 1), makeTS(2019, 3, 1))) + ).toDS.map[Event[Interaction]]((e: ExposureN) => e.toInteraction) val result = NLevelInteractionTransformer(InteractionTransformerConfig(6)).transform(exposures) diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/outcomes/OutcomeSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/outcomes/OutcomeSuite.scala index 376885b9..234e91a5 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/outcomes/OutcomeSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/outcomes/OutcomeSuite.scala @@ -36,41 +36,4 @@ class OutcomeSuite extends AnyFlatSpec { // Then assert(result == expected) } - - "fromRow" should "allow creation of a Outcome event from a row object" in { - // Given - val schema = StructType( - StructField("pID", StringType) :: - StructField("name", StringType) :: - StructField("date", TimestampType) :: Nil - ) - val values = Array[Any]("Patient01", "bladder_cancer", makeTS(2010, 1, 1)) - val r = new GenericRowWithSchema(values, schema) - val expected = Outcome("Patient01", "bladder_cancer", makeTS(2010, 1, 1)) - - // When - val result = Outcome.fromRow(r, "pID", "name", "date") - - // Then - assert(result == expected) - } - - "fromRow" should "have severity" in { - // Given - val schema = StructType( - StructField("pID", StringType) :: - StructField("name", StringType) :: - StructField("weight", DoubleType) :: - StructField("date", TimestampType) :: 
Nil - ) - val values = Array[Any]("Patient01", "bladder_cancer", 4.0, makeTS(2010, 1, 1)) - val r = new GenericRowWithSchema(values, schema) - val expected = Outcome("Patient01", "bladder_cancer", 4.0, makeTS(2010, 1, 1)) - - // When - val result = Outcome.fromRow(r, "pID", "name", "weight", "date") - - // Then - assert(result == expected) - } } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/patients/PatientFiltersSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/patients/PatientFiltersSuite.scala new file mode 100644 index 00000000..4a384fed --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/patients/PatientFiltersSuite.scala @@ -0,0 +1,42 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.transformers.patients + +import org.apache.spark.sql.Dataset +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.extractors.patients.PatientsConfig +import fr.polytechnique.cmap.cnam.etl.patients.Patient +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class PatientFiltersSuite extends SharedContext { + + "transform" should "return the correct data in a Dataset[Patient] for a known input" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + + // Given + val config = PatientsConfig(ageReferenceDate = makeTS(2006, 1, 1)) + val input: Dataset[Patient] = Seq( + Patient("Patient_01", 1, makeTS(1945, 1, 1), None), + Patient("Patient_02", 1, makeTS(1956, 2, 1), Some(makeTS(2009, 3, 13))), + Patient("Patient_03", 2, makeTS(1937, 3, 1), Some(makeTS(1980, 4, 1))), + Patient("Patient_04", 2, makeTS(1966, 2, 1), Some(makeTS(2009, 3, 13))), + Patient("Patient_05", 1, makeTS(1935, 4, 1), Some(makeTS(2020, 3, 13))), + Patient("Patient_06", 3, makeTS(1920, 8, 1), Some(makeTS(1980, 8, 1))), + Patient("Patient_07", 3, makeTS(2000, 8, 1), Some(makeTS(1980, 8, 1))) + ).toDS() + + // When + val result = new PatientFilters(config).filterPatients(input) 
+ val expected: Dataset[Patient] = Seq( + Patient("Patient_01", 1, makeTS(1945, 1, 1), None), + Patient("Patient_02", 1, makeTS(1956, 2, 1), Some(makeTS(2009, 3, 13))), + Patient("Patient_03", 2, makeTS(1937, 3, 1), Some(makeTS(1980, 4, 1))), + Patient("Patient_04", 2, makeTS(1966, 2, 1), Some(makeTS(2009, 3, 13))) + ).toDS() + + // Then + assertDSs(result, expected) + } + +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossTransformerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossTransformerSuite.scala new file mode 100644 index 00000000..c2eb6cf4 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/etl/transformers/tracklosses/TracklossTransformerSuite.scala @@ -0,0 +1,35 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.etl.transformers.tracklosses + +import org.apache.spark.sql.{DataFrame, Dataset} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{Event, Molecule, Trackloss} +import fr.polytechnique.cmap.cnam.util.functions + +class TracklossTransformerSuite extends SharedContext { + + "transform" should "return correct result" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + // Given + val config = TracklossesConfig(functions.makeTS(2006, 12, 31)) + val drugs: Dataset[Event[Molecule]] = Seq( + Molecule("Patient_01", "3400935418487", 1.0, functions.makeTS(2006, 1, 15)), + Molecule("Patient_01", "3400935418487", 1.0, functions.makeTS(2006, 6, 30)), + Molecule("Patient_02", "3400935563538", 1.0, functions.makeTS(2006, 1, 5)), + Molecule("Patient_02", "3400935563538", 1.0, functions.makeTS(2006, 1, 15)), + Molecule("Patient_02", "3400935563538", 1.0, functions.makeTS(2006, 1, 30)), + Molecule("Patient_02", "3400935563538", 1.0, functions.makeTS(2006, 1, 30)) + ).toDS() + val expected: Dataset[Event[Trackloss]] = Seq( + Trackloss("Patient_01", functions.makeTS(2006, 3, 15)), + 
Trackloss("Patient_01", functions.makeTS(2006, 8, 30)), + Trackloss("Patient_02", functions.makeTS(2006, 3, 30)) + ).toDS() + // When + val res = new TracklossTransformer(config).transform(drugs) + // Then + assertDSs(expected, res) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/DcirSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/DcirSourceExtractorSuite.scala new file mode 100644 index 00000000..9f81900f --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/DcirSourceExtractorSuite.scala @@ -0,0 +1,26 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import org.apache.spark.sql.DataFrame +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.DrugConfig +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.Cip13Level +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +class DcirSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when it fails if the tables have not been flattened" in { + val sqlCtx = sqlContext + + // Given + val dcir: DataFrame = sqlCtx.read.load("src/test/resources/test-input/DCIR.parquet") + val source = new Sources(dcir = Some(dcir)) + val path = "target/test/output" + val drugConfig = new DrugConfig(Cip13Level, List.empty) + val dcirSource = new DcirSourceExtractor(path, "overwrite", drugConfig) + + dcirSource.extract(source) + + assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/HadSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/HadSourceExtractorSuite.scala new file mode 100644 index 00000000..5190b203 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/HadSourceExtractorSuite.scala @@ -0,0 +1,22 @@ +// License: BSD 3 clause + +package 
fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +class HadSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when it fails if the tables have not been flattened" in { + val sqlCtx = sqlContext + + // Given + val had = spark.read.parquet("src/test/resources/test-input/HAD.parquet") + val source = new Sources(had = Some(had)) + val path = "target/test/output" + val hadSource = new HadSourceExtractor(path, "overwrite") + // When + hadSource.extract(source) + // Then, make sure everything is running. + assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/ImbSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/ImbSourceExtractorSuite.scala new file mode 100644 index 00000000..53195285 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/ImbSourceExtractorSuite.scala @@ -0,0 +1,22 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +class ImbSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when it fails if the tables have not been flattened" in { + val sqlCtx = sqlContext + + // Given + val imbR = spark.read.parquet("src/test/resources/test-input/IR_IMB_R.parquet") + val source = new Sources(had = Some(imbR)) + val path = "target/test/output" + val imbSource = new ImbSourceExtractor(path, "overwrite") + // When + imbSource.extract(source) + // Then, make sure everything is running. 
+ assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoCeSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoCeSourceExtractorSuite.scala new file mode 100644 index 00000000..c415991a --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoCeSourceExtractorSuite.scala @@ -0,0 +1,22 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +class McoCeSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when it fails if the tables have not been flattened" in { + val sqlCtx = sqlContext + + // Given + val mcoce = spark.read.parquet("src/test/resources/test-input/MCO_CE.parquet") + val source = new Sources(mcoCe = Some(mcoce)) + val path = "target/test/output" + val mcoCeSource = new McoCeSourceExtractor(path, "overwrite") + // When + mcoCeSource.extract(source) + // Then, make sure everything is running. 
+ assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoSourceExtractorSuite.scala new file mode 100644 index 00000000..493716e8 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/McoSourceExtractorSuite.scala @@ -0,0 +1,22 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +class McoSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when it fails if the tables have not been flattened" in { + val sqlCtx = sqlContext + + // Given + val mco = spark.read.parquet("src/test/resources/test-input/MCO.parquet") + val source = new Sources(mco = Some(mco)) + val path = "target/test/output" + val mcoSource = new McoSourceExtractor(path, "overwrite") + // When + mcoSource.extract(source) + // Then, make sure everything is running. 
+ assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SourceExtractorSuite.scala new file mode 100644 index 00000000..80ff9aa9 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SourceExtractorSuite.scala @@ -0,0 +1,87 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + + +import scala.reflect.runtime.universe +import org.apache.spark.sql.{DataFrame, Dataset, Row, SQLContext} +import fr.polytechnique.cmap.cnam.etl.extractors.Extractor +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.events.{Event, McoCIM10Act, MedicalAct} +import fr.polytechnique.cmap.cnam.etl.extractors.codes.{ExtractorCodes, SimpleExtractorCodes} +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.util.functions.makeTS +import fr.polytechnique.cmap.cnam.util.reporting.{OperationMetadata, OperationTypes} +import fr.polytechnique.cmap.cnam.util.Path + +class SourceExtractorSuite extends SharedContext { + lazy val sqlCtx: SQLContext = super.sqlContext + + // This shouldn't be replicated anywhere; mocking should be the preferred technique. + // Mocking the Extractor is not possible because of type erasure: type erasure prevents typing the implicit + // in the extract method of trait Extractor, as the type is not known at compile time, but only at run time. 
+ val testExtractor = new Extractor[MedicalAct, SimpleExtractorCodes] { + override def getCodes: SimpleExtractorCodes = SimpleExtractorCodes.empty + + override def isInStudy(row: Row): Boolean = true + + override def isInExtractorScope(row: Row): Boolean = true + + override def builder(row: Row): Seq[Event[MedicalAct]] = + Seq(McoCIM10Act("Patient_02", "10000123_10000987_2006", "C670", makeTS(2005, 12, 31))) + + override def getInput(sources: Sources): DataFrame = { + import sqlCtx.implicits._ + Seq[Event[MedicalAct]]( + McoCIM10Act("Patient_02", "10000123_10000987_2006", "C670", makeTS(2005, 12, 31)), + McoCIM10Act("Patient_02", "10000123_20000123_2007", "C670", makeTS(2007, 1, 31)), + McoCIM10Act("Patient_02", "10000123_30000546_2008", "C670", makeTS(2008, 3, 10)) + ).toDF + } + + override def extract( + sources: Sources) + (implicit ctag: universe.TypeTag[MedicalAct]): Dataset[Event[MedicalAct]] = { + import sqlCtx.implicits._ + Seq[Event[MedicalAct]]( + McoCIM10Act("Patient_02", "10000123_10000987_2006", "C670", makeTS(2005, 12, 31)), + McoCIM10Act("Patient_02", "10000123_20000123_2007", "C670", makeTS(2007, 1, 31)), + McoCIM10Act("Patient_02", "10000123_30000546_2008", "C670", makeTS(2008, 3, 10)) + ).toDS + } + } + + "extract" should "run and report the Extractors" in { + import sqlCtx.implicits._ + + // Given + + val sources = Sources() + val ds = Seq[Event[MedicalAct]]( + McoCIM10Act("Patient_02", "10000123_10000987_2006", "C670", makeTS(2005, 12, 31)), + McoCIM10Act("Patient_02", "10000123_20000123_2007", "C670", makeTS(2007, 1, 31)), + McoCIM10Act("Patient_02", "10000123_30000546_2008", "C670", makeTS(2008, 3, 10)) + ).toDS + val path = "target/test/output" + + val expected = List( + OperationMetadata( + "Mock", + List("Mock"), + OperationTypes.AnyEvents, + Path(path, "Mock", "data").toString, + Path(path, "Mock", "patients").toString + ) + ) + // When + val se: SourceExtractor = new SourceExtractor(path, "overwrite") { + override val sourceName:
String = "Test" + override val extractors: List[ExtractorSources[MedicalAct, ExtractorCodes]] = + List(ExtractorSources[MedicalAct, SimpleExtractorCodes](testExtractor, List("Mock"), "Mock")) + } + + val result = se.extract(sources) + // Then + assert(result == expected) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrCeSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrCeSourceExtractorSuite.scala new file mode 100644 index 00000000..df3c8970 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrCeSourceExtractorSuite.scala @@ -0,0 +1,33 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.extractors.sources.ssrce.SsrCeSource +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class SsrCeSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when extraction fails because the tables have not been flattened" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + val colNames = new SsrCeSource {}.ColNames + // Given + val ssrCe = Seq( + ("Patient_A", "AAAA", makeTS(2010, 1, 1)), + ("Patient_A", "BBBB", makeTS(2010, 2, 1)), + ("Patient_B", "CCCC", makeTS(2010, 3, 1)), + ("Patient_B", "CCCC", makeTS(2010, 4, 1)), + ("Patient_C", "BBBB", makeTS(2010, 5, 1)) + ).toDF( + colNames.PatientID, colNames.CamCode, colNames.StartDate + ) + val source = new Sources(ssrCe = Some(ssrCe)) + val path = "target/test/output" + val ssrCeSource = new SsrCeSourceExtractor(path, "overwrite") + // When + ssrCeSource.extract(source) + // Then, make sure everything is running.
+ assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrSourceExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrSourceExtractorSuite.scala new file mode 100644 index 00000000..cb812e33 --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/bulk/extractors/SsrSourceExtractorSuite.scala @@ -0,0 +1,22 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.bulk.extractors + +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.sources.Sources + +class SsrSourceExtractorSuite extends SharedContext { + "extract" should "extract available Events and warn when extraction fails because the tables have not been flattened" in { + val sqlCtx = sqlContext + + // Given + val ssr = spark.read.parquet("src/test/resources/test-input/SSR.parquet") + val source = new Sources(ssr = Some(ssr)) + val path = "target/test/output" + val ssrSource = new SsrSourceExtractor(path, "overwrite") + // When + ssrSource.extract(source) + // Then, make sure everything is running.
+ assert(true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtractorTransformSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtractorTransformSuite.scala index 7a2c5064..a5ef1ae7 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtractorTransformSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainExtractorTransformSuite.scala @@ -2,16 +2,17 @@ package fr.polytechnique.cmap.cnam.study.fall +import org.apache.spark.sql.{Encoders, SparkSession} import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.events._ -import fr.polytechnique.cmap.cnam.etl.extractors.patients.{Patients, PatientsConfig} +import fr.polytechnique.cmap.cnam.etl.extractors.patients.{AllPatientExtractor, PatientsConfig} import fr.polytechnique.cmap.cnam.etl.implicits import fr.polytechnique.cmap.cnam.etl.patients.Patient import fr.polytechnique.cmap.cnam.etl.sources.Sources +import fr.polytechnique.cmap.cnam.etl.transformers.patients.PatientFilters import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig import fr.polytechnique.cmap.cnam.study.fall.extractors._ import fr.polytechnique.cmap.cnam.util.reporting._ -import org.apache.spark.sql.{Encoders, SparkSession} class FallMainExtractorTransformSuite extends SharedContext { @@ -33,11 +34,11 @@ class FallMainExtractorTransformSuite extends SharedContext { assertDSs(new DiagnosisExtractor(fallConfig.diagnoses).extract(sources), spark.read.parquet(meta.get("diagnoses").get.outputPath) .as(Encoders.bean(classOf[Event[Diagnosis]]))) - assertDSs(new ActsExtractor(fallConfig.medicalActs).extract(sources), + assertDSs(new ActsExtractor(fallConfig.medicalActs).extract(sources)._1, spark.read.parquet(meta.get("acts").get.outputPath) .as(Encoders.bean(classOf[Event[MedicalAct]]))) - assertDSs(new Patients(PatientsConfig(fallConfig.base.studyStart)).extract(sources), - 
spark.read.parquet(meta.get("extract_patients").get.outputPath) + assertDSs(new PatientFilters(PatientsConfig(fallConfig.base.studyStart)).filterPatients(AllPatientExtractor.extract(sources)), + spark.read.parquet(meta.get("extract_filtered_patients").get.outputPath) .as(Encoders.bean(classOf[Patient]))) } } diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainSuite.scala index 80aec387..f14b82df 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/FallMainSuite.scala @@ -6,7 +6,6 @@ import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.implicits import fr.polytechnique.cmap.cnam.etl.sources.Sources import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig -import org.apache.spark.sql.functions.lit class FallMainSuite extends SharedContext { @@ -45,8 +44,9 @@ class FallMainSuite extends SharedContext { import implicits.SourceReader val sources = Sources.sanitize(sqlContext.readSources(fallConfig.input)) val expectedOutputPaths = List( - "target/test/output/drug_purchases/data", "target/test/output/extract_patients/data", - "target/test/output/filter_patients/data", "target/test/output/exposures/data" + "target/test/output/drug_purchases/data", "target/test/output/extract_raw_patients/data", + "target/test/output/extract_filtered_patients/data", "target/test/output/filter_patients/data", + "target/test/output/exposures/data" ) val expectedOutputTypes = List("dispensations", "patients", "exposures") diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigSuite.scala index 4f22ac99..ba2ff1c8 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigSuite.scala +++ 
b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/config/FallConfigSuite.scala @@ -10,7 +10,7 @@ import me.danielpes.spark.datetime.implicits._ import org.scalatest.flatspec.AnyFlatSpec import fr.polytechnique.cmap.cnam.etl.config.BaseConfig import fr.polytechnique.cmap.cnam.etl.config.study.StudyConfig.{InputPaths, OutputPaths} -import fr.polytechnique.cmap.cnam.etl.extractors.drugs.level.PharmacologicalLevel +import fr.polytechnique.cmap.cnam.etl.extractors.events.drugs.level.PharmacologicalLevel import fr.polytechnique.cmap.cnam.etl.transformers.exposures.{LatestPurchaseBased, LimitedExposureAdder} class FallConfigSuite extends AnyFlatSpec { @@ -65,11 +65,12 @@ class FallConfigSuite extends AnyFlatSpec { | to_exposure_strategy = "latest_purchase_based" | } | } - | interaction { + | interactions { | level: 5 + | minimum_duration: 50 days | } | patients { - | start_gap_in_months: 2 + | start_gap_in_months: 2 | } | drugs { | level: "Pharmacological" @@ -79,7 +80,7 @@ class FallConfigSuite extends AnyFlatSpec { | sites: ["BodySites"] | } | run_parameters { - | outcome: ["Acts", "Diagnoses", "Outcomes"] // pipeline of calculation of outcome, possible values : Acts, Diagnoses, and Outcomes + | outcome: ["Acts", "Diagnoses", "HospitalDeaths", "Outcomes"] // pipeline of calculation of outcome, possible values : Acts, Diagnoses, and Outcomes | exposure: ["Patients", "DrugPurchases", "Exposures"] // pipeline of the calculation of exposure, possible values : Patients, StartGapPatients, DrugPurchases, Exposures | } | """.trim.stripMargin @@ -97,7 +98,11 @@ class FallConfigSuite extends AnyFlatSpec { endThresholdGc = 900.days, toExposureStrategy = LatestPurchaseBased ) - ), drugs = defaultConf.drugs.copy( + ), interactions = defaultConf.interactions.copy( + 5, + 50.days + ), + drugs = defaultConf.drugs.copy( level = PharmacologicalLevel ), runParameters = defaultConf.runParameters.copy(exposure = List("Patients", "DrugPurchases", "Exposures")) ) diff --git 
a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/FallHospitalStayExtractorSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/FallHospitalStayExtractorSuite.scala new file mode 100644 index 00000000..89e769ab --- /dev/null +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/extractors/FallHospitalStayExtractorSuite.scala @@ -0,0 +1,43 @@ +// License: BSD 3 clause + +package fr.polytechnique.cmap.cnam.study.fall.extractors + +import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema +import org.apache.spark.sql.types.{DateType, StructField, StructType} +import fr.polytechnique.cmap.cnam.SharedContext +import fr.polytechnique.cmap.cnam.etl.extractors.codes.SimpleExtractorCodes +import fr.polytechnique.cmap.cnam.etl.extractors.sources.mco.McoSource +import fr.polytechnique.cmap.cnam.util.functions.makeTS + +class FallHospitalStayExtractorSuite extends SharedContext { + val colNames = new McoSource {}.ColNames + val newColNames = new McoSource {}.NewColumns + + "extractEnd" should "return the end date from end date column" in { + // Given + val schema = StructType( + StructField(colNames.EndDate, DateType) :: + StructField(newColNames.EstimatedStayStart, DateType) :: Nil + ) + val array = Array[Any](makeTS(2020, 1, 3), makeTS(2020, 1, 1)) + val input = new GenericRowWithSchema(array, schema) + val expected = makeTS(2020, 1, 3) + //When + val result = new FallHospitalStayExtractor(SimpleExtractorCodes.empty).extractEnd(input) + assert(result.get == expected) + } + + it should "fall back on the start date when the end date column is null" in { + // Given + val schema = StructType( + StructField(colNames.EndDate, DateType) :: + StructField(newColNames.EstimatedStayStart, DateType) :: Nil + ) + val array = Array[Any](null, makeTS(2020, 1, 1)) + val input = new GenericRowWithSchema(array, schema) + val expected = makeTS(2020, 1, 1) + //When + val result = new 
FallHospitalStayExtractor(SimpleExtractorCodes.empty).extractEnd(input) + assert(result.get == expected, true) + } +} diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformerSuite.scala index 1c783247..449cf5a9 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformerSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/FracturesTransformerSuite.scala @@ -9,6 +9,7 @@ import fr.polytechnique.cmap.cnam.etl.events._ import fr.polytechnique.cmap.cnam.study.fall.FallMain.CCAMExceptions import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig import fr.polytechnique.cmap.cnam.study.fall.config.FallConfig.FracturesConfig +import fr.polytechnique.cmap.cnam.study.fall.extractors.Death import fr.polytechnique.cmap.cnam.util.functions.makeTS @@ -24,10 +25,10 @@ class FracturesTransformerSuite extends SharedContext { val testConf = defaultConf.copy(outcomes = FracturesConfig(fallFrame = 3.months)) val acts: Dataset[Event[MedicalAct]] = Seq( //pubic ambulatory acts - McoCEAct("georgette", DcirAct.groupID.PublicAmbulatory, "MZMP007", 1.0, makeTS(2010, 2, 6)), - McoCEAct("georgettebis", DcirAct.groupID.PublicAmbulatory, "MZMP007", 1.0, makeTS(2010, 2, 6)), - McoCEAct("george", DcirAct.groupID.PublicAmbulatory, "whatever", 1.0, makeTS(2010, 2, 6)), - DcirAct("john", DcirAct.groupID.PublicAmbulatory, "MZMP007", 1.0, makeTS(2010, 2, 6)), + McoCeCcamAct("georgette", McoCeCcamAct.category, "MZMP007", 1.0, makeTS(2010, 2, 6)), + McoCeCcamAct("georgettebis", McoCeCcamAct.category, "MZMP007", 1.0, makeTS(2010, 2, 6)), + McoCeCcamAct("george", McoCeCcamAct.category, "whatever", 1.0, makeTS(2010, 2, 6)), + DcirAct("john", McoCeCcamAct.category, "MZMP007", 1.0, makeTS(2010, 2, 6)), //private ambulatory acts DcirAct("riri", DcirAct.groupID.PrivateAmbulatory, "NBEP002", 1.0, makeTS(2007, 
1, 1)), DcirAct("fifi", DcirAct.groupID.PrivateAmbulatory, "stupidcode", 1.0, makeTS(2007, 1, 1)), @@ -47,18 +48,26 @@ class FracturesTransformerSuite extends SharedContext { McoMainDiagnosis("emile", "3", "S222", 2.0, makeTS(2017, 7, 18)), McoMainDiagnosis("emile", "3", "S222", 3.0, makeTS(2017, 7, 18)), McoMainDiagnosis("emile", "3", "S222", 4.0, makeTS(2017, 7, 18)), - McoMainDiagnosis("kevin", "BassinRachis", "S327", 3.0, makeTS(2017, 7, 18)), + McoMainDiagnosis("kevin", "1", "S327", 3.0, makeTS(2017, 7, 18)), McoMainDiagnosis("jean", "4", "S120", 4.0, makeTS(2017, 7, 18)), McoMainDiagnosis("Paul", "1", "S42.54678", makeTS(2017, 7, 20)), - McoMainDiagnosis("Paul", "7", "hemorroides", makeTS(2017, 1, 2)), + McoMainDiagnosis("Paul", "7", "S42.54678", makeTS(2017, 1, 2)), McoAssociatedDiagnosis("Jacques", "8", "qu'est-ce-que tu fais là?", makeTS(2017, 7, 18)) ).toDS + val surgeries: Dataset[Event[MedicalAct]] = Seq[Event[MedicalAct]]( + McoCCAMAct("kevin", "1", "NHDA007", makeTS(2017, 7, 18)) + ).toDS() + + val hospitalDeaths: Dataset[Event[HospitalStay]] = Seq[Event[HospitalStay]]( + HospitalStay("emile", "3", Death.value, 0D, makeTS(2017, 7, 18), Some(makeTS(2017, 7, 18))) + ).toDS() + val expectedOutcomes = Seq( //hospitalization Outcome("emile", "Ribs", "hospitalized_fall", 4.0, makeTS(2017, 7, 18)), Outcome("kevin", "BassinRachis", "hospitalized_fall", 3.0, makeTS(2017, 7, 18)), - Outcome("jean", "Rachis", "hospitalized_fall", 4.0, makeTS(2017, 7, 18)), + Outcome("jean", "Rachis", "hospitalized_fall", 2.0, makeTS(2017, 7, 18)), //private ambulatory Outcome("riri", "FemurExclusionCol", PrivateAmbulatoryFractures.outcomeName, 1.0, makeTS(2007, 1, 1)), //public ambulatory @@ -69,11 +78,10 @@ class FracturesTransformerSuite extends SharedContext { Outcome("Ben", "MembreSuperieurDistal", "Liberal", 1.0, makeTS(2017, 7, 18)), Outcome("Beni", "MembreSuperieurDistal", "Liberal", 1.0, makeTS(2017, 7, 18)), Outcome("Sam", "CraneFace", "Liberal", 1.0, makeTS(2015, 7, 
18)) - ).toDS //When - val result = new FracturesTransformer(testConf).transform(liberalActs, acts, diagnoses) + val result = new FracturesTransformer(testConf).transform(liberalActs, acts, diagnoses, surgeries, hospitalDeaths) //Then assertDSs(result, expectedOutcomes) diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFracturesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFracturesSuite.scala index 7fa07622..7082e058 100644 --- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFracturesSuite.scala +++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/HospitalizedFracturesSuite.scala @@ -2,170 +2,212 @@ package fr.polytechnique.cmap.cnam.study.fall.fractures +import org.apache.spark.sql.Dataset import fr.polytechnique.cmap.cnam.SharedContext import fr.polytechnique.cmap.cnam.etl.events.{Outcome, _} +import fr.polytechnique.cmap.cnam.study.fall.extractors.{Death, Mutation, Transfer} import fr.polytechnique.cmap.cnam.util.functions._ class HospitalizedFracturesSuite extends SharedContext { - "isInCodeList" should "return yes if there is a code with the right start" in { - // Given - val event = McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)) - val codes = Set("jam", "bon", "de", "bayonne") - - // When - val result = HospitalizedFractures.isInCodeList(event, codes) - - // Then - assert(result) - } + "filterDiagnosesWithoutDP" should + "filter out LinkedDiagnosis and AssociatedDiagnoses that do not have a MainDiagnosis" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ - it should "return yes if there is an exact same code" in { // Given - val event = McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)) - val codes = Set("jambe cassée", "bon", "de", "bayonne") + val diagnoses: Dataset[Event[Diagnosis]] = Seq( + McoMainDiagnosis("Pierre", "3", "S02.42", makeTS(2017, 7, 18)), + McoMainDiagnosis("Paul", "1",
"S42.54678", makeTS(2017, 7, 20)), + McoMainDiagnosis("Paul", "7", "S02.42", makeTS(2017, 1, 2)), + McoAssociatedDiagnosis("Jacques", "8", "qu'est-ce-que tu fais là?", makeTS(2017, 7, 18)), + McoLinkedDiagnosis("Jacques", "8", "qu'est-ce-que tu fais là?", makeTS(2017, 7, 18)), + McoAssociatedDiagnosis("Paul", "7", "S02.42", makeTS(2017, 1, 2)), + McoLinkedDiagnosis("Pierre", "3", "S02.42", makeTS(2017, 7, 18)) + ).toDS - // When - val result = HospitalizedFractures.isInCodeList(event, codes) + val expected: Dataset[Event[Diagnosis]] = Seq( + McoMainDiagnosis("Pierre", "3", "S02.42", makeTS(2017, 7, 18)), + McoMainDiagnosis("Paul", "1", "S42.54678", makeTS(2017, 7, 20)), + McoMainDiagnosis("Paul", "7", "S02.42", makeTS(2017, 1, 2)), + McoAssociatedDiagnosis("Paul", "7", "S02.42", makeTS(2017, 1, 2)), + McoLinkedDiagnosis("Pierre", "3", "S02.42", makeTS(2017, 7, 18)) + ).toDS + // When + val result = HospitalizedFractures.filterDiagnosesWithoutDP(diagnoses) // Then - assert(result) + assertDSs(result, expected) } - it should "return no if there is no correct code" in { + "getFractureFollowUpStays" should "get the HospitalStay where the CCAM is in CCAMExceptions" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ + // Given - val event = McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)) - val codes = Set("avada kedavra", "bon", "de", "bayonne") + val medicalActs: Dataset[Event[MedicalAct]] = Seq( + McoCCAMAct("Paul", "1", "LJGA001", makeTS(2017, 7, 20)), + McoCCAMAct("Paul", "1", "Whatever", makeTS(2017, 12, 20)) + ).toDS - // When - val result = HospitalizedFractures.isInCodeList(event, codes) + val expected: Dataset[HospitalStayID] = Seq( + HospitalStayID("Paul", "1") + ).toDS + // When + val result = HospitalizedFractures.getFractureFollowUpStays(medicalActs) // Then - assert(!result) + assertDSs(result, expected) } + "filterDiagnosisForFracturesFollowUp" should + "return Diagnosis events that do not have a fracture follow-up hospital stay" in { +
val sqlCtx = sqlContext + import sqlCtx.implicits._ - "isFractureDiagnosis" should "return yes for correct CIM10 code" in { // Given - val event = McoMainDiagnosis("Pierre", "3", "S02.35", makeTS(2017, 7, 18)) - - // When - val result = HospitalizedFractures.isFractureDiagnosis(event, AllSites.codesCIM10) + val input: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Paul", "1", "hemorroides", makeTS(2017, 7, 20)), + McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)), + McoMainDiagnosis("Pierre", "4", "jambe cassée", makeTS(2016, 7, 18)) + ).toDS - // Then - assert(result) - } + val badStays = Seq( + HospitalStayID("Pierre", "3") + ).toDS - "isMainDiagnosis" should "return yes for correct DP code" in { - // Given - val event = McoMainDiagnosis("Pierre", "3", "whatever", makeTS(2017, 7, 18)) + val expected: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Paul", "1", "hemorroides", makeTS(2017, 7, 20)), + McoMainDiagnosis("Pierre", "4", "jambe cassée", makeTS(2016, 7, 18)) + ).toDS // When - val result = HospitalizedFractures.isMainOrDASDiagnosis(event) + val result = HospitalizedFractures.filterDiagnosisForFracturesFollowUp(badStays)(input) // Then - assert(result) + assertDSs(result, expected) } - it should "return no for other code" in { - // Given - val event = McoLinkedDiagnosis("Pierre", "3", "whatever", makeTS(2017, 7, 18)) + "getFourthLevelSeverity" should "return diagnosis where the patient died at the end of the same hospital stay" in { + val sqlCtx = sqlContext + import sqlCtx.implicits._ - // When - val result = HospitalizedFractures.isMainOrDASDiagnosis(event) + // Given + val input: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Paul", "1", "hemorroides", makeTS(2017, 7, 20)), + McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)), + McoMainDiagnosis("Pierre", "4", "jambe cassée", makeTS(2016, 7, 18)) + ).toDS - // Then - assert(!result) - } + val stays: Dataset[Event[HospitalStay]] =
List[Event[HospitalStay]]( + McoHospitalStay("Paul", "1", Death.value, 8.0D, makeTS(2017, 7, 20), Some(makeTS(2017, 7, 20))), + McoHospitalStay("Pierre", "3", Mutation.value, 8.0D, makeTS(2017, 7, 18), Some(makeTS(2017, 7, 18))), + McoHospitalStay("Pierre", "4", Transfer.value, 8.0D, makeTS(2016, 7, 18), Some(makeTS(2016, 7, 18))) + ).toDS() - "isBadGHM" should "return yes for correct GHM code" in { - // Given - val event = McoCCAMAct("Pierre", "3", "LJGA001", makeTS(2017, 7, 18)) + val expected: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Paul", "1", "hemorroides", makeTS(2017, 7, 20)) + ).toDS // When - val result = HospitalizedFractures.isBadGHM(event) + val result = HospitalizedFractures.getFourthLevelSeverity(stays)(input) // Then - assert(result) + assertDSs(result, expected) } - "filterHospitalStay" should "return correct dataset" in { + "getThirdLevel" should "return diagnosis where the patient did have a surgery during the same stay" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val input = List( + val input: Dataset[Event[Diagnosis]] = List( McoMainDiagnosis("Paul", "1", "hemorroides", makeTS(2017, 7, 20)), - McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)) + McoMainDiagnosis("Pierre", "3", "jambe cassée", makeTS(2017, 7, 18)), + McoMainDiagnosis("Pierre", "4", "jambe cassée", makeTS(2016, 7, 18)) ).toDS - val badStays = Seq( - HospitalStayID("Pierre", "3") - ).toDS + val surgeries: Dataset[Event[MedicalAct]] = List[Event[MedicalAct]]( + McoCCAMAct("Pierre", "5", "jambe", 8.0D, makeTS(2017, 7, 18), Some(makeTS(2017, 7, 18))), + McoCCAMAct("Pierre", "4", "test", 8.0D, makeTS(2016, 7, 18), Some(makeTS(2016, 7, 18))) + ).toDS() - val expected = List( - McoMainDiagnosis("Paul", "1", "hemorroides", makeTS(2017, 7, 20)) + val expected: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Pierre", "4", "jambe cassée", makeTS(2016, 7, 18)) ).toDS // When - val result = 
HospitalizedFractures.filterHospitalStay(input, badStays) + val result = HospitalizedFractures.getThirdLevelSeverity(surgeries)(input) // Then assertDSs(result, expected) } - "transform" should "return correct Outcome dataset" in { + "assignSeverityToDiagnosis" should "assign a weight to a Diagnosis based on stays and surgeries of the patient" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given - val diagnoses = Seq( - McoMainDiagnosis("Pierre", "3", "S02.42", makeTS(2017, 7, 18)), - McoMainDiagnosis("Paul", "1", "S42.54678", makeTS(2017, 7, 20)), - McoMainDiagnosis("Paul", "7", "hemorroides", makeTS(2017, 1, 2)), - McoAssociatedDiagnosis("Jacques", "8", "qu'est-ce-que tu fais là?", makeTS(2017, 7, 18)) + val input: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Paul", "1", "ColDuFemur", makeTS(2017, 7, 20)), + McoMainDiagnosis("Pierre", "3", "Coude", makeTS(2017, 7, 18)), + McoMainDiagnosis("Pierre", "4", "Poignet", makeTS(2016, 7, 18)) ).toDS - val medicalActs = Seq( - McoCCAMAct("Paul", "1", "LJGA001", makeTS(2017, 7, 20)) - ).toDS + val surgeries: Dataset[Event[MedicalAct]] = List[Event[MedicalAct]]( + McoCCAMAct("Pierre", "4", "test", 8.0D, makeTS(2016, 7, 18), None) + ).toDS() + + val stays: Dataset[Event[HospitalStay]] = List[Event[HospitalStay]]( + McoHospitalStay("Paul", "1", Death.value, 8.0D, makeTS(2017, 7, 20), Some(makeTS(2017, 7, 20))) + ).toDS() - val expected = Seq( - Outcome("Pierre", "AllSites", "hospitalized_fall", makeTS(2017, 7, 18)) + val expected: Dataset[Event[Diagnosis]] = List( + McoMainDiagnosis("Paul", "1", "ColDuFemur", 4D, makeTS(2017, 7, 20), None), + McoMainDiagnosis("Pierre", "3", "Coude", 2D, makeTS(2017, 7, 18)), + McoMainDiagnosis("Pierre", "4", "Poignet", 3D, makeTS(2016, 7, 18)) ).toDS // When - val result = HospitalizedFractures.transform(diagnoses, medicalActs, List(AllSites)) + val result = HospitalizedFractures.assignSeverityToDiagnosis(stays, surgeries)(input) + // Then assertDSs(result, expected) } - 
"transform" should "return correct weight" in { + "transform" should "return Fractures Event Dataset based on the algorithm" in { val sqlCtx = sqlContext import sqlCtx.implicits._ // Given val diagnoses = Seq( - McoMainDiagnosis("Pierre", "3", "S02.42", 2.0, makeTS(2017, 7, 18)), - McoMainDiagnosis("Jean", "2", "S02.42", 3.0, makeTS(2017, 7, 18)), - McoMainDiagnosis("Kevin", "4", "S02.42", 4.0, makeTS(2017, 7, 18)), - McoMainDiagnosis("Paul", "1", "S42.54678", 2.0, makeTS(2017, 7, 20)), - McoMainDiagnosis("Paul", "7", "hemorroides", 2.0, makeTS(2017, 1, 2)), - McoAssociatedDiagnosis("Jacques", "8", "qu'est-ce-que tu fais là?", 2.0, makeTS(2017, 7, 18)) + McoMainDiagnosis("Pierre", "3", "S02.42", makeTS(2017, 7, 18)), + McoMainDiagnosis("Paul", "1", "S42.54678", makeTS(2017, 7, 20)), + McoMainDiagnosis("Paul", "7", "S02.42", makeTS(2017, 1, 2)), + McoMainDiagnosis("Charlotte", "9", "S02.42", makeTS(2017, 10, 22)), + McoAssociatedDiagnosis("Jacques", "8", "qu'est-ce-que tu fais là?", makeTS(2017, 7, 18)) ).toDS + val surgeries: Dataset[Event[MedicalAct]] = List[Event[MedicalAct]]( + McoCCAMAct("Pierre", "3", "test", 8.0D, makeTS(2016, 7, 18), None) + ).toDS() + + val stays: Dataset[Event[HospitalStay]] = List[Event[HospitalStay]]( + McoHospitalStay("Paul", "7", Death.value, 8.0D, makeTS(2017, 1, 2), Some(makeTS(2017, 1, 3))) + ).toDS() + val medicalActs = Seq( McoCCAMAct("Paul", "1", "LJGA001", makeTS(2017, 7, 20)) ).toDS - val expected = Seq( - Outcome("Pierre", "AllSites", "hospitalized_fall", 2.0, makeTS(2017, 7, 18)), - Outcome("Jean", "AllSites", "hospitalized_fall", 3.0, makeTS(2017, 7, 18)), - Outcome("Kevin", "AllSites", "hospitalized_fall", 4.0, makeTS(2017, 7, 18)) + val expected: Dataset[Event[Outcome]] = Seq[Event[Outcome]]( + Outcome("Pierre", "AllSites", "hospitalized_fall", 3D, makeTS(2017, 7, 18), None), + Outcome("Paul", "AllSites", "hospitalized_fall", 4D, makeTS(2017, 1, 2), None), + Outcome("Charlotte", "AllSites", "hospitalized_fall", 2D, 
        makeTS(2017, 10, 22), None)
    ).toDS

    // When
-    val result = HospitalizedFractures.transform(diagnoses, medicalActs, List(AllSites))
+    val result = HospitalizedFractures.transform(diagnoses, medicalActs, stays, surgeries, List(AllSites))

    // Then
    assertDSs(result, expected)
  }
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFracturesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFracturesSuite.scala
index 960370e0..466a5c83 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFracturesSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/LiberalFracturesSuite.scala
@@ -20,10 +20,10 @@ class LiberalFracturesSuite extends SharedContext {
      DcirAct("Sam", "3", "4561", makeTS(2015, 7, 18))
    ).toDF.as[Event[MedicalAct]]
    val expected = Seq(
-      Outcome("Pierre", "Clavicule", "Liberal", makeTS(2017, 7, 18)),
-      Outcome("Ben", "MembreSuperieurDistal", "Liberal", makeTS(2017, 7, 18)),
-      Outcome("Sam", "CraneFace", "Liberal", makeTS(2015, 7, 18)),
-      Outcome("Sam", "undefined", "Liberal", makeTS(2015, 7, 18))
+      Outcome("Pierre", "Clavicule", "Liberal", 1D, makeTS(2017, 7, 18), None),
+      Outcome("Ben", "MembreSuperieurDistal", "Liberal", 1D, makeTS(2017, 7, 18), None),
+      Outcome("Sam", "CraneFace", "Liberal", 1D, makeTS(2015, 7, 18), None),
+      Outcome("Sam", "undefined", "Liberal", 1D, makeTS(2015, 7, 18), None)
    ).toDF.as[Event[Outcome]]

    //When
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFracturesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFracturesSuite.scala
index 3cee7b37..5fe5e161 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFracturesSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PrivateAmbulatoryFracturesSuite.scala
@@ -2,8 +2,9 @@
 package fr.polytechnique.cmap.cnam.study.fall.fractures

+import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events.{DcirAct, Outcome}
+import fr.polytechnique.cmap.cnam.etl.events.{DcirAct, Event, Outcome}
 import fr.polytechnique.cmap.cnam.util.functions._

 class PrivateAmbulatoryFracturesSuite extends SharedContext {
@@ -53,8 +54,8 @@ class PrivateAmbulatoryFracturesSuite extends SharedContext {
      DcirAct("loulou", DcirAct.groupID.PublicAmbulatory, "stupidcode", makeTS(2007, 1, 1))
    ).toDS

-    val expected = Seq(
-      Outcome("riri", "FemurExclusionCol", PrivateAmbulatoryFractures.outcomeName, makeTS(2007, 1, 1))
+    val expected: Dataset[Event[Outcome]] = Seq[Event[Outcome]](
+      Outcome("riri", "FemurExclusionCol", PrivateAmbulatoryFractures.outcomeName, 1D, makeTS(2007, 1, 1), None)
    ).toDS

    // When
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFracturesSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFracturesSuite.scala
index d63514a1..b2f9e8d1 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFracturesSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/fall/fractures/PublicAmbulatoryFracturesSuite.scala
@@ -2,15 +2,16 @@
 package fr.polytechnique.cmap.cnam.study.fall.fractures

+import org.apache.spark.sql.Dataset
 import fr.polytechnique.cmap.cnam.SharedContext
-import fr.polytechnique.cmap.cnam.etl.events.{DcirAct, McoCEAct, McoCIM10Act, Outcome}
+import fr.polytechnique.cmap.cnam.etl.events.{DcirAct, Event, McoCeCcamAct, McoCIM10Act, Outcome}
 import fr.polytechnique.cmap.cnam.util.functions.makeTS

 class PublicAmbulatoryFracturesSuite extends SharedContext {

   "isPublicAmbulatory" should "return true for correct events" in {
    // Given
-    val event = McoCEAct("georgette", "ACE", "angine", makeTS(2010, 2, 6))
+    val event = McoCeCcamAct("georgette", "ACE", "angine", makeTS(2010, 2, 6))

    // When
    val result = PublicAmbulatoryFractures.isPublicAmbulatory(event)
@@ -32,7 +33,7 @@ class PublicAmbulatoryFracturesSuite extends SharedContext {

   "containsNonHospitalizedCcam" should "return true for correct events" in {
    // Given
-    val event = McoCEAct("georgette", "ACE", "MZMP007", makeTS(2010, 2, 6))
+    val event = McoCeCcamAct("georgette", "ACE", "MZMP007", makeTS(2010, 2, 6))

    // When
    val result = PublicAmbulatoryFractures.containsNonHospitalizedCcam(event)
@@ -47,13 +48,13 @@ class PublicAmbulatoryFracturesSuite extends SharedContext {
    // Given
    val events = Seq(
-      McoCEAct("georgette", "ACE", "MZMP007", makeTS(2010, 2, 6)),
-      McoCEAct("george", "ACE", "whatever", makeTS(2010, 2, 6)),
+      McoCeCcamAct("georgette", "ACE", "MZMP007", makeTS(2010, 2, 6)),
+      McoCeCcamAct("george", "ACE", "whatever", makeTS(2010, 2, 6)),
      DcirAct("john", "ACE", "MZMP007", makeTS(2010, 2, 6))
    ).toDS

-    val expected = Seq(
-      Outcome("georgette", "MembreSuperieurDistal", PublicAmbulatoryFractures.outcomeName, makeTS(2010, 2, 6))
+    val expected: Dataset[Event[Outcome]] = Seq[Event[Outcome]](
+      Outcome("georgette", "MembreSuperieurDistal", PublicAmbulatoryFractures.outcomeName, 1D, makeTS(2010, 2, 6), None)
    ).toDS

    // When
diff --git a/src/test/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/outcomes/NaiveBladderCancerSuite.scala b/src/test/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/outcomes/NaiveBladderCancerSuite.scala
index dd1771b0..b9984f17 100644
--- a/src/test/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/outcomes/NaiveBladderCancerSuite.scala
+++ b/src/test/scala/fr/polytechnique/cmap/cnam/study/pioglitazone/outcomes/NaiveBladderCancerSuite.scala
@@ -18,7 +18,7 @@ class NaiveBladderCancerSuite extends SharedContext {
      McoMainDiagnosis("PatientA", "C67", makeTS(2010, 1, 1)),
      McoLinkedDiagnosis("PatientA", "C67", makeTS(2010, 2, 1)),
      McoAssociatedDiagnosis("PatientA", "C67", makeTS(2010, 3, 1)),
-      ImbDiagnosis("PatientA", "C67", makeTS(2010, 4, 1)),
+      ImbCcamDiagnosis("PatientA", "C67", makeTS(2010, 4, 1)),
      McoMainDiagnosis("PatientA", "ABC", makeTS(2010, 5, 1))
    ).toDS