Nov 02 2012
R and loops don’t get along. It’s usually easier and faster to vectorize your problem.
The apply function lets you run a calculation over every row or every column of a matrix or data frame in a single call, and it is usually quick. It’s also easier to write than a for loop.
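To make the comparison concrete, here is a minimal sketch of the same row-mean calculation written both ways. The matrix `m` and the names `loop_means` and `apply_means` are made up for illustration and are not part of the data used in this post.

```r
# Hypothetical 3 x 4 matrix, used only for illustration.
m <- matrix(1:12, nrow = 3)

# Explicit loop: preallocate the result, then fill it row by row.
loop_means <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_means[i] <- mean(m[i, ])
}

# The same calculation as a single apply call over the rows.
apply_means <- apply(m, 1, mean)

# Both approaches give the same values.
all.equal(loop_means, apply_means)
```

The apply version needs no counter and no preallocated result vector, which is most of why it is easier to write.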
I have some data tables that I read in using read.table.
> head(data)
                     X3dpfE   X3dpfS   brain1   brain2   brain3   heart1   heart2   heart3   liver1   liver2   liver3
ENSDARG00000000001 1.562704 1.302656 3.386386 4.501326 3.770699 4.410918 3.119245 4.981105 4.543457 4.660851 4.166984
ENSDARG00000000002 1.868292 2.669400 3.774026 2.148060 3.342408 6.346923 5.704955 6.396921 3.177544 2.848264 3.031184
ENSDARG00000000018 5.537198 6.120828 6.227827 5.199024 6.745472 7.235136 4.266326 7.304808 6.080495 5.901095 6.149637
ENSDARG00000000019 8.307379 8.347306 7.602548 7.846188 7.367002 5.948792 6.926612 5.624044 7.266805 6.870143 7.184076
ENSDARG00000000068 4.499482 3.008155 3.887356 3.989466 4.078162 4.942363 4.140618 5.336359 5.254668 4.807248 4.800853
ENSDARG00000000069 7.198090 6.767412 3.703204 4.509300 3.804258 5.377648 5.205194 5.073296 7.843168 8.043932 7.857601
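If you want to follow along without the original file, one option is to paste a few of the rows shown above into read.table as text. This is only a stand-in: the variable `txt` is an assumption for illustration, and the real data were read from a file that is not shown here.

```r
# Stand-in for the original file, which is not available here:
# the first three rows of the head(data) output, pasted as text.
txt <- "X3dpfE X3dpfS brain1 brain2 brain3 heart1 heart2 heart3 liver1 liver2 liver3
ENSDARG00000000001 1.562704 1.302656 3.386386 4.501326 3.770699 4.410918 3.119245 4.981105 4.543457 4.660851 4.166984
ENSDARG00000000002 1.868292 2.669400 3.774026 2.148060 3.342408 6.346923 5.704955 6.396921 3.177544 2.848264 3.031184
ENSDARG00000000018 5.537198 6.120828 6.227827 5.199024 6.745472 7.235136 4.266326 7.304808 6.080495 5.901095 6.149637"

# The header has one fewer field than the data rows, so read.table
# uses the first column (the gene IDs) as row names.
data <- read.table(text = txt, header = TRUE)

dim(data)       # number of rows and columns
rownames(data)  # the gene IDs
```

With a real file you would pass the file name instead of `text = txt`.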
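I want to get the mean of specific columns for every gene, here columns 3 and 5 (brain1 and brain3). Note that c(3,5) selects just those two columns; use 3:5 to include brain2 as well. This is how you use the apply function to quickly do this.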
> meanBrain13 = apply(data[c(3,5)],1,mean)
> head(meanBrain13)
ENSDARG00000000001 ENSDARG00000000002 ENSDARG00000000018 ENSDARG00000000019 ENSDARG00000000068 ENSDARG00000000069
          3.578543           3.558217           6.486649           7.484775           3.982759           3.753731
The “1” before the mean tells apply to work by ROW: it takes the mean across the selected columns, giving one value per gene. If you use “2”, it works by COLUMN instead and takes the mean of each individual column.
> apply(data[c(3,5)],2,mean)
  brain1   brain3
2.931025 2.909532
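The difference between the two settings is easier to see on a tiny example. The sketch below uses a hypothetical 2 x 3 matrix (the names `m`, `geneA`, `geneB`, and `s1` to `s3` are invented for illustration). It also checks the results against rowMeans and colMeans, which are base R shortcuts for these two particular cases.

```r
# Hypothetical 2 x 3 matrix with named rows and columns.
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2,
            dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# MARGIN = 1: one mean per row (averaging across the columns).
row_means <- apply(m, 1, mean)

# MARGIN = 2: one mean per column (averaging down the rows).
col_means <- apply(m, 2, mean)

# rowMeans() and colMeans() are built-in shortcuts for these two cases.
all.equal(row_means, rowMeans(m))
all.equal(col_means, colMeans(m))
```

For plain means the shortcuts are the simpler choice; apply is the more general tool because it accepts any function, such as median or sd.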
To write this data to a tab-delimited file without any quotes, use write.table.
> write.table(meanBrain13,file="meanBrain13.cqn.rpmk.dat",sep="\t", quote=FALSE)
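To check what ends up in the file, a quick round trip can help. This is a sketch under assumptions: the vector `means` is a made-up stand-in for meanBrain13, and the output goes to a temporary file rather than the file name used above.

```r
# Hypothetical named vector standing in for meanBrain13.
means <- c(geneA = 3.5, geneB = 4.25)

# Write to a temporary file so nothing in the working directory is touched.
out <- tempfile(fileext = ".dat")
write.table(means, file = out, sep = "\t", quote = FALSE)

# Inspect the raw lines: a header line, then one name<TAB>value row per element.
readLines(out)

# Reading it back recovers the names as row names and the values as a column.
back <- read.table(out, header = TRUE, sep = "\t")
rownames(back)
back[[1]]
```

Because the header line has one fewer field than the data rows, read.table treats the first column as row names when the file is read back in.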