There I learned to operate statistics packages, such as SPSS, BMDP, and S+, under the direction of the statisticians.
We did the statistical analysis of things like drug effectiveness, radiation sensitivity, toxicology, epidemiology, and so forth. We also vetted the statistical models of proposed research projects, under the umbrella of the ethics boards and the research oversight committees. (But that's a different story.)
Anyway, one of our well-funded, pay-the-bills studies was to investigate a particular medical device, an ultrasound-based scanner that the sponsor was hoping could be used for the early detection of malignant prostate cancer.
The datasets we analyzed came from the scanner, from interpreted micrographs of biopsies, and from physicians doing what was delicately referred to as "DRE", aka "Digital Rectal Exam".
One day, I was reading some of the research documentation, and found a clause in the physician participation agreement that referred to types of published analysis that would not be done with the datasets. There was the obvious stuff about personally identifying information, privacy of participants, and so forth.
And then there was a class of verboten disclosure that was described in complex terms, way over my head.
I asked one of the researchers about it. She smiled, sketched out an analysis for me to run, and asked me to bring her a printout when it was done.
A short while later, I handed it to her, and she interpreted it for me.
On average, the DRE turned out to be almost a waste of time. And what was worse, a small but significant number of physicians were literally worse than random, at a statistically significant level. That is, if they had just reversed all of their diagnoses, they would have done a better job.
It was an amazing and important finding. And the physician participation agreement forbade us to officially discover or publish it.
The researcher took the printout, and dropped it into the garbage. And we went back to work.