Progress in managing the transition from the RS92 to the Vaisala RS41 as the operational radiosonde within the GCOS Reference Upper-Air Network

Ruud J. Dirksen1, Greg E. Bodeker2, Peter W. Thorne3, Andrea Merlone4, Tony Reale5, Junhong Wang6, Dale F. Hurst7, Belay B. Demoz8, Tom D. Gardiner9, Bruce Ingleby10, Michael Sommer1, Christoph von Rohden1, and Thierry Leblanc11
1 GRUAN Lead Centre, Deutscher Wetterdienst, Meteorologisches Observatorium Lindenberg, Am Observatorium 12, 15848 Tauche/Lindenberg, Germany
2 Bodeker Scientific, Alexandra, New Zealand
3 Maynooth University, Maynooth, Ireland
4 INRI, Turin, Italy
5 NOAA/NESDIS, Washington D.C., USA
6 Department of Atmospheric and Environmental Sciences, State University of New York, Albany, USA
7 NOAA Earth System Research Laboratory, Boulder, CO, USA
8 Howard University, Washington D.C., USA
9 National Physical Laboratory, Teddington, UK
10 ECMWF, Reading, UK
11 JPL, Pasadena, USA

its end of production, Vaisala's RS92 radiosonde was also widely used outside of GRUAN, with a global market share of approximately 30% (including at least daily launches at sites on every continent). Its performance was among the best of the commercially available radiosonde models (Nash et al., 2011). Until recently, the majority of the 27 GRUAN sites employed the RS92 (listed in Table 1  upper-air sounding. Any change in instrumentation in a GOS (Global Observing System) network not only presents potential data continuity concerns, but also poses a challenge to a far broader community of users. Radiosounding data form a key input to NWP systems and human forecasts, such that any difference in performance has potentially large impacts. The challenge is to ensure continuity of operations without negative scientific or financial ramifications. The manual on the GOS states that "Changes of bias caused by changes in instrumentation should be evaluated by a sufficient period of observation (perhaps as much as a year) or by making use of the results of instrument intercomparisons made at designated test sites" (Section 2.2.2.13 of WMO, 2017).
One of the key potential benefits of a tiered network design (Bodeker et al., 2016) is the dissemination of information derived from a subset of top-tier reference quality sites down to the geographically broader lower-tier network sites. Figure 2 schematically depicts a three-tiered upper-air observing system architecture, with a 30-40 station GRUAN network providing reference observations for more extensive networks such as the GCOS upper-air network (GUAN). Reference network sites serve as the long-term anchor points that comprehensively characterize the atmospheric column with the highest quality measurements currently feasible.
The base of the system is the entire global upper-air observing system, serving a wide variety of purposes, primarily weather prediction, and including the operational radiosonde network, aircraft and satellite observations, etc., and embracing model-assimilated upper-air datasets and reanalyses. The 177-station (as of March 2019) GUAN is a subset of the operational radiosonde network that, in the late 1990s, committed to long-term, consistent observations, but does not deploy any special instruments for high-quality climate observations as GRUAN does.
In the case of the transition from RS92 to other radiosonde types, the lessons learnt from GRUAN activities to manage the transition may benefit those GUAN sites faced with the same challenge as well as other sonde stations from the remainder of the GOS. Furthermore, by undertaking an intensive characterization of the transition, GRUAN can assist not just the climate community but also other communities such as numerical weather prediction (NWP)/forecasting through active dissemination of the resulting analyses of any effects of the transition. Such an approach requires visibility RS92 radiosondes and their successors.

-Offers of assistance from the expert user community in the synthesis of twin launch data from multiple sites.
-Suggestions of additional analyses that could be performed.
The remainder of this paper outlines various aspects of the change strategy, such as a network-wide approach including burden-sharing, the application of ancillary data, the metrological perspective on the change, the role of documentation, and the creation of a scientific database that stores all measurement data relevant to the change. Furthermore, it reports on the progress to date and current plans to diagnose biases between RS92 and RS41 radiosondes, and provides some initial results based on analysis of 224 twin soundings that have been performed at the Lindenberg Observatory. The data of the laboratory experiments that were used for the plots in this paper are stored in a permanent repository with digital object identifier (DOI) 10.5676/GRUAN/dpkg-2019-1 (Dirksen and Sommer, 2019).

The challenge
For any long-term measurement series, inevitably, the challenge of change management is certain to arise. This could be either through choice when improved or more efficient means of making during all four seasons. The study highlighted the importance of undertaking a sustained programme (i.e. >1 year) of coincident soundings by old and new instrumentation to understand any seasonality of biases. Seasonally dependent biases may arise from changes in the measured ECVs and/or annual cycles of covariates such as radiation effects which are particularly important at those sites where launches systematically occur at or near dusk/dawn. The second switch, RS92 to RS-11G, involved 52 twin soundings over a period of two years (Kobayashi et al., 2019).
There are some crucial distinctions between this precursor analysis at Tateno and the current GRUAN-wide transition from RS92 to RS41 radiosondes. For Tateno:
-At the time, neither the original nor the replacement sonde models had GRUAN data products being processed and provided routinely to the user community.
-The change related to a single instrument at a single site.
-The update arose from a choice by the site to change instrumentation such that the timetable could be altered as necessary.
Although the matter of broader change management pertaining to simultaneous instrument transitions at multiple sites has been informally discussed on various occasions, e.g. during GRUAN's annual Implementation and Coordination Meetings (ICMs), prior to the RS92 cessation of production, there existed no formal plan for managing such a widespread change. This large-scale transition poses a major challenge for GRUAN as a reference network because it must not compromise the continuity, quality, and homogeneity of the data records.
This change of the operational radiosonde at the majority of the GRUAN sites is unprecedented in the history of GRUAN. A network-wide challenge requires a facilitated and coordinated solution if the change is to be successful, and if GRUAN is to succeed. The fundamental challenge is thus to design and deliver a GRUAN-wide strategy for managing and coordinating the near-simultaneous changes of the operational radiosonde at many sites. This strategy should include all aspects of the change management including inter alia: network coordination to share the burden, the necessary roles of ancillary measurements coincident with the sonde measurements, Since GRUAN always seeks to promote competition in the marketplace, sites were encouraged to consider all available options for how to proceed with their transition from the RS92, including changing to sondes produced by other manufacturers. Despite the independent decision-making procedure, each GRUAN site that launched RS92 sondes has transitioned to the RS41. It is important to stress that these decisions arose from the individual GRUAN sites and not from the network management. The consequence is that the network must transition between two instrument models from the same manufacturer. If some sites had chosen to switch to radiosondes from different manufacturers, GRUAN would have been required to develop a very different change management programme than the one described here.
Two of GRUAN's strengths are its ability to call on expertise from across the network to tackle such challenges and its ability to distribute required actions among the sites to share the burden.
Currently, specialists from various fields of expertise within GRUAN are engaged in addressing the above-mentioned points. This not only benefits the affected sites and GRUAN as a whole, but also helps other observational networks, such as GUAN, in managing the same transition.

Change management
Proper management of the change of a measurement system requires determining all relevant differences between both systems prior to the transition. Typically this means organizing a period of observational system overlap as well as laboratory-based characterization of the differences between the instruments. While the specialized facilities to perform extensive laboratory testing are not available at each site, there is no impediment to sites performing real-world intercomparisons like twin soundings, although the costs of extra receiving systems and sondes may pose limits on the number of flights that can be performed.
It is essential to quantify biases between the new and the old instruments as well as changes in calibration/measurement errors and uncertainties. These attributes may have complex interactions with covariates, which complicates the quantification of the effects of the change. One example of such a measurement error, the solar radiation-induced temperature bias, will vary with altitude, season and geographical location, because it depends on the ambient pressure, solar elevation angle and radiation intensity (Dirksen et al., 2014). The full range of sources of RS92 uncertainties that may have complex spatio-temporal characteristics, including ventilation, sensor orientation, and prelaunch calibration, has been described in detail by Dirksen et al. (2014). These uncertainties will also vary with location due to their dependence on solar elevation angle, cloudiness, and winds. Once the biases have been identified and corrected for, and the uncertainties have been determined, the data (m1 and m2) from both measurement systems should be consistent, meaning that the agreement  Immler et al., 2010). Another way of saying this is that the measurement data consistently lie within each other's uncertainty coverage factors (u1 and u2) after accounting for any effects of non-coincidence. operation strategy would be required, which depends on (at a minimum): the instrument characteristics, variability in the target measurand, and cost and logistical considerations. Kobayashi et al. (2012) suggest that a total of 120 twin soundings spread across the seasonal cycle, at a given location, would be more than sufficient to characterize the effects of a change in radiosonde instrumentation, although the study did not consider the metrological quantification aspects that are necessary in the current case. Hence, for radiosondes, the GRUAN Lead Centre recommended that sites perform twin soundings at weekly or bi-weekly intervals for a period of two years.
The two-year period ensures that seasonality is better probed, and a sounding interval of a week instead of a day mitigates the additional operational costs. The twin soundings should be equally distributed between day and night soundings.
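The consistency requirement described above can be illustrated with a short sketch. It assumes the criterion commonly attributed to Immler et al. (2010), |m1 - m2| < k * sqrt(sigma^2 + u1^2 + u2^2), where sigma accounts for non-coincidence; because the exact expression is not reproduced in this section, the formula, the default k = 2 and the function name `is_consistent` are assumptions made for illustration only.

```python
import math


def is_consistent(m1, m2, u1, u2, k=2.0, sigma=0.0):
    """Check two measurements for consistency within their uncertainties.

    m1, m2 : measured values from the two systems (same units)
    u1, u2 : their standard uncertainties
    k      : coverage factor (k = 2 is an assumed default)
    sigma  : extra variability due to non-coincidence (assumed 0 here)
    """
    combined = math.sqrt(sigma ** 2 + u1 ** 2 + u2 ** 2)
    return abs(m1 - m2) < k * combined


# Hypothetical temperatures (K) from a twin sounding at one level
print(is_consistent(220.10, 220.25, u1=0.15, u2=0.10))
```

With these hypothetical values the difference of 0.15 K is smaller than twice the combined uncertainty, so the pair would count as consistent under the assumed criterion.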
Coordination with satellite overpass is highly encouraged, with particular emphasis on targeting times when a GNSS-RO and polar orbiter overpass may occur in the site's proximity. The sonde-satellite data comparison would also serve as an additional station-to-station transfer-check of consistency of the results. This is further discussed in Section 7.
Laboratory testing is used to characterize the radiosonde's sensors, as is done to establish a GRUAN data product for each radiosonde model. The results of the laboratory tests for the RS92-RS41 transition are being employed in developing a GRUAN data product for the RS41. The parameters tested include: radiation error for the temperature and humidity sensors, sensor calibration accuracy, and response time-lag and hysteresis effects of the humidity sensor. The Lead Centre facility, hosted by DWD at the Lindenberg Observatory, has access to a broad range of laboratory-based facilities suitable for characterizing radiosonde performance. These facilities, shown in Figure 3, were used to characterize the RS92 instrument as described in Dirksen et al. (2014) and include:
-standard humidity chambers (SHCs) to validate RH sensor calibration,
-climate chamber operating between -75 °C and +20 °C for sensor response time-lag testing.
Section 5 gives a more detailed discussion of the application of these facilities to the characterization of radiosondes, together with an overview of preliminary results from these laboratory tests. This approach to burden sharing will promote quality over quantity, as it is envisaged that some sites will opt only for a limited intercomparison period, or perhaps only a short campaign-like effort.
On their own, these limited intercomparison efforts would be insufficient to properly investigate the seasonality of the differences between the old and the new sounding systems. But as a network, the sum of these small contributions becomes substantial. Indian Monsoon, RS92-RS41 twin soundings were performed to investigate the differences between both systems under these particular meteorological conditions.
All data from the intercomparisons listed in Tables 1 and 2 will be made available to the scientific community via the GRUAN data servers, as discussed further in Section 8.3.

Support equipment
To support data telemetry for RS41 radiosonde measurements at GRUAN sites, the Lead Centre has a spare, fully-equipped radiosounding receiving system that can be temporarily loaned to sites that cannot afford to purchase a second receiving station. This fully functioning system consists of an antenna, an MW41 receiving system, and a compact Vaisala WXT weather station for collecting metadata surface observations at the time of the launch. In addition to this hardware, an SHC can be loaned for performing manufacturer-independent pre-launch checks in a 100 %RH environment, as discussed by Dirksen et al. (2014).
Furthermore, Vaisala has several MW41 systems on hand that can be loaned to sites that wish to conduct short to medium-term RS92-RS41 intercomparison campaigns. Various GRUAN sites have The laboratory facilities at the Lead Centre/Lindenberg Observatory, photographically presented in Figure 3, are being used for extensive testing to characterize the measurement errors and uncertainties of the RS41 and the RS92, an activity which is essential for the development of GRUAN data products for both radiosondes. The tests primarily focus on the error sources which are known to be dominant for radiosondes, i.e. the solar radiation heating of the temperature and humidity sensors, the response time-lag, and the accuracy and reproducibility of the humidity sensor's calibration.
Radiation error tests are performed in an adapted SHC at pressures between ambient and 3 hPa (see Dirksen et al. (2014) for a description of the radiation tests and their configuration) as well as in a newly developed system that allows for improved ventilation and illumination. Preliminary radiation tests were performed on RS41 radiosondes from 2014 through 2019, and further tests are foreseen for 2020. The results indicate that the temperature sensor of the RS41 radiosonde is less susceptible to heating by solar radiation than that of the RS92. However, these results apply to raw (uncorrected) measurement data, and it is not possible to draw direct conclusions on the resulting temperature bias between the Vaisala-processed data products of RS92 and RS41, since different corrections for radiative heating are applied in the data processing of the two sondes. The calibration accuracy and hysteresis effects of the humidity sensor have been investigated by placing the radiosonde's sensor boom in an SHC having a stable, well-defined RH between 0 % and 100 %. The stable RH environment inside each SHC is achieved using one of the saline mixtures listed in Table 4. In addition, each SHC is equipped with a Pt100 reference thermometer which tests the calibration accuracy of the radiosonde temperature sensor. Table 3 summarizes the laboratory experiments that have been performed to date to characterize the RS41 radiosonde.
The lab-based characterization results will be included in the scientific database holding all data that are relevant to the RS92 to RS41 transition (discussed in Section 8.3).

Results of the laboratory characterization
To assess the calibration accuracy of the humidity and temperature sensors, more than 150 RS41 radiosondes from various production batches were tested in SHCs at the relative humidities listed in Table 4. In a typical experiment, each RS41 was sequentially placed in a series of six SHCs with increasing relative humidity, from 0 to 100 %, then back through the sequence of drier SHCs to 0 %RH. This sequence also allows assessment of hysteresis of the humidity sensor. At each humidity level, the radiosonde was immersed in the SHC for approximately 4 minutes while its readings were recorded. The air inside each SHC was circulated at 5 m/s by a fan, and the temperatures of the air and saline solution inside the SHCs were measured by Pt100 reference thermometers. For some salts both air and solution temperatures are needed to accurately determine the relative humidity in the SHC, since the humidity over the reference salt depends on both quantities.
Figure 4 shows that the majority of the temperature measurements by the RS41 are within ±0.5 K of the reference temperature. Although the tail of the distribution extends to 1 K (not shown), only a few measurements show differences beyond 0.5 K. The mode of the distribution indicates a bias of -0.025 K, while from its width it can be inferred that the calibration uncertainty is smaller than 0.1 K. The histogram of temperature differences between the RS41 and the Pt100 reference is not Gaussian, something that cannot yet be explained but will be the subject of further investigation. Figure 5 shows that all humidity measurements by the RS41 are within 2 %RH of the reference RH value with only a small fraction of the measurements showing differences larger than 1 %RH, which means that the uncertainty of the humidity calibration is smaller than 1 %RH.
The histogram of RH differences between the RS41 and the reference RH value has a pronounced peak around 0 %RH, spread, between the reference and the RS41 values are small, whereas at high RH the differences and spread are larger. As a result, all observed calibration errors at 0 %RH humidity cluster around 0 %, thereby over-representing these values in the distribution.
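The histogram statistics quoted above (the mode as a bias estimate and the fraction of measurements within a tolerance) can be computed along the following lines. This is only a sketch: the bin width of 0.025 K, the sample differences and the function names `histogram_mode` and `fraction_within` are hypothetical and are not taken from the laboratory analysis itself.

```python
from collections import Counter


def histogram_mode(differences, bin_width):
    """Return the centre of the most populated bin of width bin_width."""
    counts = Counter(round(d / bin_width) for d in differences)
    index, _ = counts.most_common(1)[0]
    return index * bin_width


def fraction_within(differences, tolerance):
    """Return the fraction of differences with magnitude <= tolerance."""
    return sum(abs(d) <= tolerance for d in differences) / len(differences)


# Hypothetical sonde-minus-reference temperature differences (K)
sample = [-0.03, -0.02, -0.025, 0.01, 0.05, -0.026, 0.6]
print(histogram_mode(sample, bin_width=0.025))
print(fraction_within(sample, tolerance=0.5))
```

Using the mode rather than the mean keeps the bias estimate insensitive to the few outliers in the tail of a non-Gaussian distribution.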
Radiation tests were conducted in the modified SHC as described by Dirksen et al. (2014). Measurements were performed at various settings within the ranges: pressure between 3 hPa and ambient, irradiance from 200 to 1000 W/m², ventilation speed of either 2.5 or 5 m/s, illumination times of 1-4 minutes.

The illumination times depended on pressure, with longer exposures required for lower pressure environments to reach equilibrium. Figure 6 shows similar heating of the RS92 and RS41 temperature sensors at 300 hPa (0.15 K), but at 3 hPa (Figure 7) the RS41 (1.4 K) heats only half as much as the RS92 (2.8 K). Furthermore, at 3 hPa, the RS41 sensor has reached equilibrium after approximately 30 s of illumination, whereas after 4 minutes of illumination the RS92 still has not reached equilibrium. The investigation of the RS41's radiation error continues, including experiments with the newly developed laboratory radiation system, and the full analysis of these tests will be reported in a separate paper.

Metrology
A fundamental metrological principle stipulates that replacing one operational instrument with another should pose no problem provided that the results from both instruments are fully traceable to SI standards. Consequently, the new instrument could (almost) instantaneously be included in the traceability chain without the need for parallel testing or comparison with the replaced device. In practice, this idealized concept can rarely be adopted, even in primary metrology laboratories or in National Metrology Institutes. The problem is that different instruments or sensors may show different responses to external environmental factors.
Concerning radiosondes, the sensors, especially for humidity and temperature, may during a sounding be exposed to unavoidable atmospheric or ascent-related effects. Some of these effects cannot fully be included in the realization of the controlled laboratory conditions which are provided during the preceding metrological instrument characterization and calibration procedures.

Consequently, differences in the responses of sensors from different sonde models may still exist during ascents. An example is the warm bias of the radiosonde's temperature sensor caused by solar radiation.
For this reason it is essential to identify and quantify these differences between the old and the new measurement system at each time of a sonde replacement not only by laboratory work but also by have not yet been correctly assessed or even identified. To assure consistency of the measurement results of different instruments, the differences are to be evaluated in terms of possible corrections, or, if this is not possible, the uncertainty budget is to be extended properly to account for them. It is important to arrange the comparisons network-wide in a coordinated manner, covering different time scales and locations. This is to ensure that as many aspects as possible are covered which may introduce systematics such as latitude, climate, environmental and technical conditions before and during launch, local specifics in the sounding procedures or setups.
An advantage of performing dual launches of RS92 and RS41 radiosondes is that some of the uncertainties related to the change process are moved from absolute evaluations to relative ones, ferred to as ancillary data. These ancillary data add significantly to the analysis of the twin-launch data, because they form an independent source of data that is not affected in the same way by the error sources that are typical for radiosoundings and therefore can be used to validate the radiosonde intercomparison data. Furthermore, within one orbit and among a limited number of consecutive orbits, satellites provide a consistent background on a global scale. However, the long-term calibration drifts and retrieval errors of space-borne instruments must be taken into account when comparing long-term data sets of coincident radiosoundings and satellite overpasses.
The GRUAN Lead Centre, in cooperation with the GRUAN Task Team Ancillary Measurements, is working with each GRUAN site to establish respective ancillary measurement data streams and ascertain which of these streams contain relevant data that could be used to support the RS92 to RS41 transition. A key goal is to establish scheduling and sampling protocols to provide ancillary information that is spatially and temporally synchronized and internally consistent (Equation 1).
Protocols are being developed and deployed to ensure that such data are submitted to the scientific database (see Section 8.3) and tagged as ancillary information to facilitate future analyses.
The use of ancillary data from GRUAN sites in such a manner is currently in an early stage of the processing, packaging and use of the ancillary data in a spatially and temporally coherent manner is a complicated task that remains under discussion.
The currently proposed plan is that the Lead Centre will identify each twin launch for the NOAA NPROVS team, including site-specific ancillary data, once coincident measurement strategies and data streams are established. The NPROVS team will append level-2 (geophysical profile) data from satellites with special emphasis on those targeted for satellite overpasses and provide routine monitoring, analysis and distribution.

Scheduling
In addition to ground-based ancillary measurements such as GNSS, lidar, MWR, and FTIR, satellite-based observations also present a valuable and abundant source of additional observations for comparisons with radiosounding data. To maximize their potential for scientific exploitation, the radiosonde launches should be scheduled to be coincident with satellite overpasses and/or the occurrence of GNSS-radio occultations (GNSS-RO) over the site. This would maximize the informational content available from instrumentation both on-site and arising from satellite-based capabilities.

ical Satellites (EUMETSAT) provides predictions of these overpasses, including so-called "golden overpasses" where the measurements of a polar orbiter, such as MetOp-A or MetOp-B, are spatially (distance < 200 km) and temporally (within 30 minutes) coincident with a GNSS-RO measurement.
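The "golden overpass" criterion quoted above (distance < 200 km, within 30 minutes) can be sketched as a simple coincidence test. The great-circle distance is computed with the haversine formula on a spherical Earth, which is an assumption of this sketch; the function names, the tuple layout and the coordinates in the usage lines are hypothetical and do not describe how the overpass predictions are actually produced.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_golden_overpass(orbiter, occultation, max_km=200.0, max_minutes=30.0):
    """Test spatial and temporal coincidence of two observations.

    orbiter, occultation : (latitude, longitude, datetime) tuples
    """
    distance = great_circle_km(orbiter[0], orbiter[1],
                               occultation[0], occultation[1])
    separation = abs(orbiter[2] - occultation[2])
    return distance < max_km and separation <= timedelta(minutes=max_minutes)


# Illustrative positions and times only
t0 = datetime(2019, 3, 1, 12, 0)
print(is_golden_overpass((52.2, 14.1, t0),
                         (53.0, 15.0, t0 + timedelta(minutes=20))))
```

The same test could be applied between a planned launch time and either observation to shortlist candidate launch windows.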
In addition, at GRUAN sites where ground-based ancillary measurements are also being made, there could be multiple redundant observations of target ECVs to enable better exploitation; see Sec-

The decreasing differences between the radiosonde and both the satellite and the model data are tentatively interpreted as the RS41 providing better RH measurements in the upper troposphere than the RS92. However, validation by additional, independent measurements is needed to substantiate this.
Figure 8 also shows that the bias between radiosonde and satellite/model data in the UT is up to a factor of two smaller for Europe than for Lauder. In addition, the shapes of the difference profiles for Lauder and Europe differ, and analysis of this discrepancy is ongoing.
The finding that the RS41 measures higher humidity values in the UT than the RS92 is also consistent with results for RS92-RS41 twin soundings that were performed at the GRUAN site in The collocated observations also permit assessment of calculated radiances, derived from radiosonde profiles using radiative transfer models, versus observed satellite radiances. Calbet et al. (2017) used this method to evaluate RS92 data, and applying the same method to RS41 data can provide additional information on the differences between RS92 and RS41.
In summary, scheduling and targeting RS41-RS92 twin soundings with satellite overpass brings more systems into the comparison and can enhance the transition analysis and provides more robust The wide range of research activities investigating the RS92-RS41 transition that are outlined in this paper will result in a substantive, and valuable, data archive.
The analysis of this data archive will be done from various perspectives by scientists with different areas of expertise. Their results will need to be shared with the atmospheric science community. Table 5 lists a preliminary allocation of research analysis tasks and their principal investigators.
With this multi-disciplinary approach to the analysis of the data it is anticipated that the differences between the RS41 and RS92 radiosondes will be well understood and that inhomogeneities in their combined long-term data records can be minimized.

The list in Table 5 is incomplete and will be expanded as needs and requirements become clearer.
The aim is to publish several distinct papers that describe the results. As further outlined in Section 9, we strongly welcome engagement by the broader atmospheric science community to analyze and publish the results.

Up to September 2019, approximately 1500 RS92-RS41 twin soundings have been performed within GRUAN. A comprehensive analysis of this extensive data set is still ongoing, but as an example of this larger effort we present the preliminary results of the twin soundings performed at Lindenberg.
According to Table 1 Dirksen et al., 2014) are used whereas the RS41 data are processed by the Vaisala MW41 system. Figure 9 shows the profile for such a twin sounding performed during daytime. The difference plot (right-hand panel) shows that up to the tropopause (in this particular case at approximately 14 km) both sondes capture the variations and structures in the humidity profile equally well, and the RS41 reports slightly higher (up to 2 %RH) humidity values than the RS92. Above the tropopause, the RS41 reports lower RH values than RS92, which is attributed to the shorter time-lag of the RS41 humidity sensor that is better able to capture the steep negative gradient in water vapor at the tropopause.
The plots in Figure 10 (left-hand panel) show that for nighttime measurements the absolute temperature differences between the two sonde models are generally smaller than 0.05 K up to 30 km altitude with the RS92 (GRUAN-processed data) reporting slightly higher temperatures than RS41 (Vaisala-processed data). Above 30 km, T_RS41 increasingly exceeds T_RS92-GDP.2, by 0.1 K at 35 km.

This indicates differences in the corrections for radiative cooling at the top of the profile for each radiosonde type. The right-hand panel of Figure 10 shows that the temperature differences for daytime measurements in the troposphere are smaller than 0.1 K, with T_RS92-GDP.2 larger than T_RS41. Above the tropopause this temperature difference gradually increases with altitude to approximately 0.6 K at 35 km.
Figure 11 shows that the tropospheric humidity values in the Vaisala-processed RS41 data are on average up to 5% higher at night and up to 10% higher during daytime. For the daytime measurements (right-hand panel) the relative differences increase with altitude, starting with a mean difference of 2.5% at the surface and reaching approximately 8% at 10 km. This observed higher RH_RS41 in the Lindenberg twin soundings is consistent with the results presented in Section 7.2 and in Figure 8.
More detailed and elaborate analyses of the differences between RS92 and RS41, including twin soundings from other (GRUAN) sites, will be performed in subsequent studies. There is already a considerable amount of data available, as is summarized in Tables 1 and 2. These studies will investigate in detail the influence of geographical and climatological effects, such as solar elevation angle, clouds, and winds, on the RS92-RS41 differences.
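Difference profiles of the kind discussed above can be summarized by averaging the paired differences in altitude layers. The sketch below assumes that the two sondes of a twin sounding have already been matched onto common altitude levels; the 5 km layer depth, the sample values and the function name `mean_difference_profile` are hypothetical and do not reproduce the processing used for the figures.

```python
from collections import defaultdict
from statistics import mean


def mean_difference_profile(altitudes_km, values_a, values_b, bin_km=5.0):
    """Average (a - b) within altitude layers of depth bin_km.

    Returns a dict mapping the lower edge of each layer (km) to the
    mean difference of all paired samples falling in that layer.
    """
    layers = defaultdict(list)
    for z, a, b in zip(altitudes_km, values_a, values_b):
        layers[int(z // bin_km)].append(a - b)
    return {index * bin_km: mean(diffs)
            for index, diffs in sorted(layers.items())}


# Hypothetical matched temperatures (K): sonde A minus sonde B
altitudes = [1.0, 3.0, 7.0, 9.0]
sonde_a = [280.1, 270.2, 240.0, 230.3]
sonde_b = [280.0, 270.0, 240.1, 230.0]
print(mean_difference_profile(altitudes, sonde_a, sonde_b))
```

Pooling such layer means over many twin soundings, separately for day and night, gives mean difference profiles comparable in form to those described for Figures 10 and 11.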

Scientific database
A dedicated database, containing all data pertaining to the RS92-RS41 transition, has been created and will be maintained. This database will be given its own digital object identifier (DOI) and be pre- launches made at the sites. Furthermore, it will include coincident ancillary measurements from e.g. satellite overpasses and/or ground-based remote sensors such as those identified in Section 7.
Making ancillary measurements available together with the radiosonde intercomparison data allows for in-depth analysis and understanding of the differences between the RS92 and RS41 radiosondes, and is commensurate with one of the key principles of GRUAN: to have measurement redundancy.
The data format of the files in the ancillary database will be CF-compliant NetCDF for ease of access and the database will be built for easy web-based data discovery and access. It will be available as it is being populated with data to enable scientific analysis from the outset.
Although it may not be possible to analyze all aspects of the data immediately, building a long-term database will enable exploitation by the expert community well into the future and represent a substantial value-added outcome. The database availability will be advertised via the GRUAN website at https://www.gruan.org and readers should check this source for the latest status.

Technical documentation
Documentation is a foundation stone of a reference network such as GRUAN. It is essential for the transfer of knowledge, ranging from describing operational procedures and best practices to perform measurements, via a detailed description of correction algorithms, to documenting changes to measurement systems. In a broader sense, robust documentation ensures the traceability of the data products, a requirement for reference data. Only through the existence of proper documentation is it possible to assure the quality of the measurement data within GRUAN. All GRUAN technical documentation is available on the GRUAN website under https://www.gruan.org/documentation/gruan/.

In the specific case of the RS92 to RS41 transition, comprehensive documentation will provide the required transparency on how the change was managed, and this will make it possible to reconstruct and scrutinize the reported differences between the RS92 and the RS41 radiosondes, even after many years. Furthermore, this documentation will serve as a template for managing any future
-Synthesis of the RS92-RS41 intercomparison studies
-Comparison against non-radiosonde measurements (e.g. ancillary data)
In addition, a final paper will be drafted that collates the results of the separate reports and summarizes and evaluates the outcomes of the RS92-RS41 transition for GRUAN.

How to get involved
The GRUAN change management programme envisaged in this paper could, in principle, be completed solely by current GRUAN members (sites, Lead Centre, scientists in the working group on GRUAN, and task teams). However, we explicitly recognize that there are substantial resources and expertise beyond the immediate GRUAN community which could increase the robustness of all aspects of the envisaged programme. Some specific suggestions are given below, but there are undoubtedly many more ways to get involved.
Participation of non-GRUAN sites that plan to undertake an intercomparison of their own is strongly encouraged. Sites need not undertake the full multi-season campaign to contribute substantive value. Any additional intercomparison data will provide either additional training data sets or a means to independently validate results and ensure that any geographical effects have been adequately accounted for. Sites should contact the Lead Centre staff (lead author) to initiate a discussion around data submission requirements.
Participation of experts in the analysis of the results is strongly encouraged. Research results are likely to be more robust and comprehensive after accounting for a broad range of user inputs.
The GRUAN community, although broadly diverse, likely misses some important types of expertise.

The GRUAN Lead Centre and Working Group chairs can provide letters of support and further information to investigators wishing to apply for grant support to aid their involvement in the analysis of the transition from the RS92 to other models of radiosonde.
Dissemination and outreach of results leading to impact for both NRT and long-term applications will require sustained community engagement. It is important that the research results translate to real-world applications and that, ultimately, will require user uptake.

Summary and outlook
In this paper we have described the ongoing GRUAN-wide coordinated approach to managing the change from the Vaisala RS92 to the RS41 as an operational radiosonde system within GRUAN.
Since the network's goal is to provide long-term reference-quality observations of ECVs such as