# O4 lines files documentation

Continuous gravitational wave analyses that need to exclude narrow
frequency bands confirmed to be contaminated by non-astrophysical
artifacts should use the CSV files in this directory for the O4 epoch.

## Reading lines files contents

The first line is a header describing the contents of each column.

- **Frequency or frequency spacing [Hz]** for lines, this is the
  frequency of the most prominent feature of the artifact; for combs,
  this is the comb spacing
- **Type** integer value: lines (0), combs with fixed width
  as a function of frequency (1), combs with linearly increasing width
  as a function of frequency (2), bands with known time limits that are
  impacted by many narrow spectral artifacts (3)
- **Frequency offset [Hz]** frequency offset of the artifact in units
  of hertz. For lines, this value is always 0 Hz; for combs, this
  value may be non-zero so that a particular comb tooth may have
  frequency equal to n x comb frequency + frequency offset.
- **First visible harmonic** for lines, this is always an integer value
  of 1; for combs, this is the harmonic number of the first visible comb
  tooth, whose frequency is equal to first visible harmonic x comb
  frequency + frequency offset
- **Last visible harmonic** for lines, this is always an integer value
  of 1; for combs, this is the harmonic number of the last visible comb
  tooth, whose frequency is equal to last visible harmonic x comb
  frequency + frequency offset
- **Left width [Hz]** is the width of the band extending to lower
  frequencies from the frequency (for lines) or the comb tooth frequency
  (for combs), so that frequency - left width equals the lowest
  frequency impacted by that particular narrow artifact
- **Right width [Hz]** is the width of the band extending to higher
  frequencies from the frequency (for lines) or the comb tooth frequency
  (for combs), so that frequency + right width equals the highest
  frequency impacted by that particular narrow artifact
- **Comments** contains any notes or important information about the
  artifact
- **Segments known to be present** GPS time segments during which the
  artifact is known to be present, given as a list of (GPS start, GPS
  stop) tuples. Where no list is given, the list is empty "[]", or the
  list is "\*", the artifact should be considered to be always present
  or to have unknown time variation
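
The column descriptions above can be turned into a small reader. The following is a minimal sketch, not a reference implementation. It assumes that the columns appear in the order listed above, that the segments field is written as a Python-style list of tuples, and that any field containing commas is quoted. The names `Artifact`, `parse_segments`, and `read_artifacts`, as well as the example rows and header, are hypothetical.

```python
import ast
import csv
import io
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Artifact:
    frequency: float      # line frequency or comb spacing [Hz]
    kind: int             # 0 line, 1 fixed-width comb, 2 scaling-width comb, 3 band
    offset: float         # frequency offset [Hz]
    first_harmonic: int
    last_harmonic: int
    left_width: float     # [Hz]
    right_width: float    # [Hz]
    comments: str
    # None means "always present or unknown time variation"
    segments: Optional[List[Tuple[float, float]]]


def parse_segments(field: str) -> Optional[List[Tuple[float, float]]]:
    """Parse the segments field; a missing list, "[]", or "*" yields None."""
    text = field.strip()
    if text in ("", "[]", "*"):
        return None
    # Assumption: the field is a Python-style literal such as "[(start, stop)]".
    return [(float(start), float(stop)) for start, stop in ast.literal_eval(text)]


def read_artifacts(handle) -> List[Artifact]:
    """Read artifacts from an open CSV file, skipping the header line."""
    reader = csv.reader(handle)
    next(reader)  # the first line is the header
    artifacts = []
    for row in reader:
        if not row:
            continue
        row = row + [""] * (9 - len(row))  # pad missing trailing fields
        artifacts.append(Artifact(
            frequency=float(row[0]),
            kind=int(float(row[1])),
            offset=float(row[2]),
            first_harmonic=int(float(row[3])),
            last_harmonic=int(float(row[4])),
            left_width=float(row[5]),
            right_width=float(row[6]),
            comments=row[7].strip(),
            segments=parse_segments(row[8]),
        ))
    return artifacts


# Hypothetical rows for illustration only; the header text is a placeholder.
example = io.StringIO(
    "frequency,type,offset,first,last,left,right,comments,segments\n"
    "60.0,0,0.0,1,1,0.05,0.05,hypothetical line,\n"
    '1.0,1,0.25,10,12,0.001,0.001,hypothetical comb,"[(1370000000, 1370086400)]"\n'
)
artifacts = read_artifacts(example)
```

For a real file, pass an open file handle instead of the `io.StringIO` object. If the actual files store segments in a different format, only `parse_segments` needs to change.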

## Using the line files information as vetoes

The frequency bands impacted by vetted non-astrophysical artifacts are
computed from these files as intervals [start frequency, stop frequency]:

- **Lines** (type 0)
  ```math
  [frequency - left width, frequency + right width]
  ```
- **Fixed width combs** (type 1)
  ```math
  [harmonic * frequency + offset - left width, harmonic * frequency + offset + right width]
  ```
- **Scaling width combs** (type 2)
  ```math
  [(harmonic * frequency + offset) * (1 - left width / (first harmonic * frequency + offset)),
   (harmonic * frequency + offset) * (1 + right width / (first harmonic * frequency + offset))]
  ```
- **Contaminated bands containing multiple artifacts** (type 3)
  ```math
  [frequency - left width, frequency + right width]
  ```
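
The four cases above can be collected into one function. This is a minimal sketch under the assumption that `harmonic` runs over every integer from the first visible harmonic to the last visible harmonic, inclusive. The function name `veto_bands` and its argument names are hypothetical.

```python
def veto_bands(frequency, kind, offset, first_harmonic, last_harmonic,
               left_width, right_width):
    """Return a list of (start, stop) frequency intervals in Hz for one artifact."""
    if kind in (0, 3):
        # Lines and contaminated bands: a single interval around the frequency.
        return [(frequency - left_width, frequency + right_width)]
    if kind not in (1, 2):
        raise ValueError(f"unknown artifact type: {kind}")

    # Frequency of the first visible tooth, used to scale widths for type 2.
    first_tooth = first_harmonic * frequency + offset
    bands = []
    # Assumption: every integer harmonic between first and last is included.
    for harmonic in range(first_harmonic, last_harmonic + 1):
        tooth = harmonic * frequency + offset
        if kind == 1:
            # Fixed width combs: same widths for every tooth.
            bands.append((tooth - left_width, tooth + right_width))
        else:
            # Scaling width combs: widths grow linearly with tooth frequency.
            bands.append((tooth * (1 - left_width / first_tooth),
                          tooth * (1 + right_width / first_tooth)))
    return bands


# Hypothetical artifacts for illustration only.
line_bands = veto_bands(60.0, 0, 0.0, 1, 1, 0.25, 0.5)
fixed_comb_bands = veto_bands(1.0, 1, 0.25, 10, 12, 0.125, 0.125)
scaling_comb_bands = veto_bands(2.0, 2, 0.0, 1, 3, 0.5, 0.5)
```

For a type 2 comb, the band at the first visible harmonic reduces to the tooth frequency minus the left width and plus the right width, so the listed widths are the widths at the first visible tooth.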

**Note 1:** Some line and comb artifacts (types 0, 1, and 2) have segment information with (GPS start, GPS stop) times listed for when contamination is present. Times outside these intervals will have reduced or no observed contamination (though not guaranteed). In the comments for those artifacts with segment information, there are two useful pieces of information to consider:
1. "segments determined from aLOGs/Fscans" typically means that the artifact was associated with specific hardware changes.
1. "segments determined from Fscans" typically means that we have observed changes, but we do not have a direct cause for those changes.

Because it would segment the data excessively, it may be logistically impractical for analyses to remove, before analyzing the data, the many narrow frequency bands for type 0, 1, and 2 artifacts that have segment information. Instead, for artifacts that have segment information, we recommend that outlier follow-up exclude the contaminated times: veto the contaminated time/frequency bands from the input data and rerun the analysis to determine the SNR of the candidate. If this is still logistically challenging, artifacts with segment information can be treated as though they were continually present.
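
For the follow-up step, the contaminated GPS times have to be removed from the data segments used in the rerun. The following is a minimal sketch of that segment subtraction, assuming that segments are (GPS start, GPS stop) tuples treated as half-open intervals. The function name `subtract_segments` and the example values are hypothetical, and an analysis pipeline may already provide an equivalent segment utility.

```python
def subtract_segments(keep, remove):
    """Return the parts of the `keep` segments not covered by any `remove` segment."""
    result = []
    for start, stop in keep:
        pieces = [(start, stop)]
        for r_start, r_stop in remove:
            next_pieces = []
            for p_start, p_stop in pieces:
                if r_stop <= p_start or r_start >= p_stop:
                    # No overlap: keep this piece unchanged.
                    next_pieces.append((p_start, p_stop))
                    continue
                if r_start > p_start:
                    # Keep the part before the contaminated time.
                    next_pieces.append((p_start, r_start))
                if r_stop < p_stop:
                    # Keep the part after the contaminated time.
                    next_pieces.append((r_stop, p_stop))
            pieces = next_pieces
        result.extend(pieces)
    return result


# Hypothetical analysis segment and contaminated times, for illustration only.
analysis_segments = [(0, 100)]
contaminated = [(10, 20), (50, 60)]
clean_segments = subtract_segments(analysis_segments, contaminated)
```

This applies only to artifacts that list segments. An artifact with no list, an empty list, or "\*" is treated as always present, so no clean times can be recovered for its frequency band this way.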

**Note 2:** Contaminated bands containing multiple artifacts (type 3)
typically still contain useful data, even during contaminated times, in frequency
regions between artifacts. However, we were unable to confidently
catalog all artifacts in these bands individually, and therefore
recommend treating the entire band with caution. It may be better to exclude these
bands from initial analysis during the contaminated times, especially if the time
period of contamination is short relative to the total observing time. Analyses should
consider the feasibility of this approach, and be aware of the increased possibility of
spurious outliers if the data from contaminated times are analyzed.
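
One way to judge whether the contaminated period is short relative to the total observing time is to compute the fraction of the analysis span covered by the listed segments. This is a minimal sketch. The function name `contaminated_fraction` is hypothetical, and the start and stop of the analysis span are inputs that each analysis must supply.

```python
def contaminated_fraction(segments, span_start, span_stop):
    """Fraction of [span_start, span_stop] covered by the (start, stop) segments."""
    # Clip each segment to the span and sort by start time.
    clipped = sorted((max(start, span_start), min(stop, span_stop))
                     for start, stop in segments)
    covered = 0.0
    covered_until = span_start
    for start, stop in clipped:
        # Skip any part already counted, so overlapping segments count once.
        start = max(start, covered_until)
        if stop > start:
            covered += stop - start
            covered_until = stop
    return covered / (span_stop - span_start)


# Hypothetical segments and span, for illustration only.
fraction = contaminated_fraction([(10, 20), (15, 30)], 0, 100)
```

This fraction is only one input to the decision; the threshold below which excluding the band during contaminated times is worthwhile is left to each analysis.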

## Lines list generation notes

- Files that can be edited are labeled `_WIP_`. Do not edit `_AUTOWIP_` files directly; they are for reference only.
- Commas should be used ONLY for separation of fields. Do not add commas in line descriptions.
