Statistical distributions in Julia

This is a quick note on how to generate random variables in Julia and sample from them.

We first need to load the package Distributions. If it is not already installed, install it first:

import Pkg  # needed on current Julia versions before Pkg.add can be called
Pkg.add("Distributions")
using Distributions

All standard distributions are implemented and well documented. I will create a normally distributed random variable with mean 5 and standard deviation 10.
nd = Normal(5, 10)
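We can apply various functions to it, e.g. the probability density function (pdf) and the cumulative distribution function (cdf):
pdf.(nd, 1:5)  # the dot broadcasts pdf over the values 1 to 5
cdf(nd, 5)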

Finally we would like to take a random sample of size 10:
rand(nd, 10)
or fill an array, a, with a random sample from this distribution:
using Random  # provides rand! on current Julia versions
a = Array{Float64}(undef, 10^5, 2);
rand!(nd, a);



Setting up emacs org-mode babel for R on Ubuntu

I installed org-mode separately, since I had trouble with its default setup (similar to the problems described here).

sudo apt-get install org-mode

Next I downloaded and installed ESS:

cd ~/.emacs.d/
wget http://ess.r-project.org/downloads/ess/ess-12.09-1.zip
unzip ess-12.09-1.zip
rm ess-12.09-1.zip

Finally I had to add the following lines to my .emacs file:

(add-to-list 'load-path "~/.emacs.d/ess-12.09-1/lisp")
(require 'ess-site)
(org-babel-do-load-languages
'org-babel-load-languages
'((R . t)))

and the following lines to my .Rprofile:

old <- getOption("defaultPackages")
options(defaultPackages = c(old, "tikzDevice"))

For testing open a new file with emacs:

emacs test.org

and add the following code block:

#+begin_src R
date()
#+end_src

Press C-c C-c with the cursor inside the block to execute the code. See the org-mode documentation for more information.


Vim-R-plugin: Installation

On Ubuntu 12.04, I took the following steps to install Vim-R-plugin:

  1. First I had to install tmux: sudo apt-get install tmux
  2. Having already installed vim-pathogen, I used git to clone vim-r-plugin and the vim-screen plugin, which is also required: cd ~/.vim/bundle
    git clone git://github.com/vim-scripts/Vim-R-plugin
    git clone git://github.com/vim-scripts/Screen-vim---gnu-screentmux
  3. Then I changed my localleader to “,” by adding let maplocalleader = "," to my .vimrc file.
  4. Finally I added the following lines to my .Rprofile:

    if(interactive()){
    library(colorout)
    library(setwidth)
    library(vimcom)
    }

That's it, everything works perfectly.


Reproducible research with markdown, knitr and pandoc

Over the last few weeks I have been trying to optimise my workflow using markdown in combination with knitr and pandoc. Knitr is a great new package by Yihui, expanding R’s capabilities for reproducible research.

I will illustrate my workflow with the following example, where I have a small R script (script.r) that I want to embed into a report. However, I do not want to write LaTeX, nor do I want to (or can I) fix my final output format at the beginning. That is where pandoc comes in. Pandoc is the Swiss Army knife when it comes to converting between markup languages.

The file script.r:


## @knitr gen-dat
a <- matrix(rnorm(100), nrow=10)

## @knitr plot
 image(a)

The report is written in pandoc-flavored markdown. The file (report_knit_.md) contains:


% A sample report
% The author
% `r date()`

<!-- Setting up R -->
`ro warning=FALSE, dev="png", fig.cap="", cache=FALSE or`

<!-- read external r code -->
```{r reading, echo=FALSE}
read_chunk("script.r")
```

# The first part of my R script
Here I can generate my data
```{r}
<<gen-dat>>
```

# Results
And now the results are plotted
```{r plot-fig, results="asis"}
<<plot>>
```

# More
Of course I can use inline elements: 3 + 3 = `r 3+3`.

For each chunk there are plenty of options to modify it (see options).

To render my report, I need to first knit it in R and then use pandoc to convert it to the final format. This can be done with


Rscript -e "library(knitr); knit('report_knit_.md')"

This results in a pandoc-flavored markdown document. Now I can use pandoc to convert this document into any output format that pandoc supports (list of formats):

  • A pdf file: pandoc -s report.md -t latex -o report.pdf
  • An html file: pandoc -s report.md -o report.html (with the -c flag a CSS stylesheet can be linked easily)
  • Openoffice: pandoc report.md -o report.odt
  • Word docx: pandoc report.md -o report.docx
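
The conversions above can also be run in one go. What follows is only a minimal sketch: the function name convert_report and the choice of formats are my own assumptions, not part of the original setup. It prints each pandoc command and runs it only if pandoc and the input file are actually present.

```shell
#!/bin/sh
# Sketch: convert one markdown report to several formats with pandoc.
# convert_report and the list of formats are illustrative assumptions.
convert_report() {
    input="$1"
    base="${input%.md}"              # report.md -> report
    for ext in html odt docx; do
        output="${base}.${ext}"
        echo "pandoc -s ${input} -o ${output}"
        # Run the conversion only if pandoc and the input file exist.
        if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
            pandoc -s "$input" -o "$output"
        fi
    done
}

convert_report report.md
```

Pandoc picks the output format from the file extension, so adding another format should only require adding its extension to the loop.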

Files are available on github.


lubridate: working with date and time in R

Working with date and time in R is sometimes tricky. Today I gave lubridate a try and was surprised by how easy it can be. Lubridate is available on GitHub and on CRAN. There is also a good introduction published in the Journal of Statistical Software.


install.packages("lubridate")
library(lubridate)
# Create an object
bday <- dmy("23121984")

Parsers exist for the other orderings of d(ay), m(onth) and y(ear) as well, e.g. ymd() or dym(); pick the one that matches the order used in the input string.

Several options are provided to work with the bday object:

wday(bday)  # day of the week
wday(bday, label=TRUE)  # day of the week, abbreviated
yday(bday)  # day of the year

lubridate also makes it easy to calculate with dates:

wday(bday + years(1), label=TRUE)  # day of the week one year later

table(sapply(1:100, function(x) wday(bday + years(x), label=TRUE)))  # days of the week for the next 100 years

Add areas to a vector in GRASS

I sometimes use r.to.vect to convert a raster map in GRASS to a vector map. One way to find the area of each of the new polygons is:

# I am working with the vector map forests
# (the module is called v.db.addcol in GRASS 6 and v.db.addcolumn in GRASS 7 and later)
v.db.addcolumn map=forests columns="area double precision"
v.to.db map=forests option=area units=h columns=area

This adds the column area to forests and fills it with the size of each polygon in hectares.


Check ports

To check which ports are open on a server, run:


nmap <ip of the server>

To check, on the other hand, which ports the server itself is listening on (that is, before the firewall is taken into account), run on the server:


netstat -tulpn

where -t selects TCP sockets, -u selects UDP sockets, -l shows listening sockets only, -p shows the PID and name of the owning process, and -n shows numerical addresses and ports instead of resolving names.
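
If only the port numbers are of interest, the netstat output can be condensed further. This is a rough sketch under the assumption that the local address sits in the fourth column, as it does in the usual Linux netstat layout; listening_ports is a name I made up for illustration.

```shell
# Sketch: reduce `netstat -tuln` style output to "protocol port" pairs.
# listening_ports is a hypothetical helper, not part of netstat.
listening_ports() {
    awk '$1 ~ /^(tcp|udp)/ {
        n = split($4, parts, ":")    # Local Address is the 4th column
        print $1, parts[n]           # the port follows the last colon
    }'
}

# Usage on a server:
# netstat -tuln | listening_ports | sort -u
```

Header lines are skipped because their first field does not start with tcp or udp.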


Kill all processes of a user

To list all processes of myuser, extract their process IDs and kill them:

# list all processes of a user
ps -fu myuser

# extract the PIDs (second column), skipping the header line
ps -fu myuser | awk 'NR != 1 {print $2}'

# kill them all
kill -9 $(ps -fu myuser | awk 'NR != 1 {print $2}')
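
Since kill -9 cannot be undone, it may be worth looking at the PID list first. The sketch below wraps the awk step in a small helper (pids_from_ps is a hypothetical name) and shows a dry run that only prints the kill command; myuser is the placeholder user from above.

```shell
# Sketch: extract the PID column from `ps -fu` style output on stdin.
# pids_from_ps is a hypothetical helper name.
pids_from_ps() {
    awk 'NR != 1 {print $2}'    # skip the header line, print column 2
}

# Dry run: print the kill command instead of executing it.
# echo kill -9 $(ps -fu myuser | pids_from_ps)
```

On systems with procps, pkill -KILL -u myuser should have the same effect in a single step.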


Rescale ranges

Let's say I have values in a range from 0 to 1 and I want to rescale them to a range from 1 to 25. In general this is done with n.min + (x - o.min) * (n.max - n.min) / (o.max - o.min), where x is the value to be rescaled and n and o stand for new and old. For example, to rescale the value 0.5 from our old scale (0 to 1) to our new scale we use:
1+(0.5-0)*(25-1)/(1-0) = 13
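
The same formula can be wrapped in a small shell function. This is just a sketch; the function name rescale and the argument order are my own choice, and awk is used because plain sh has no floating point arithmetic.

```shell
# Sketch: rescale X OLD_MIN OLD_MAX NEW_MIN NEW_MAX
# Implements n.min + (x - o.min) * (n.max - n.min) / (o.max - o.min).
rescale() {
    awk -v x="$1" -v omin="$2" -v omax="$3" -v nmin="$4" -v nmax="$5" \
        'BEGIN { print nmin + (x - omin) * (nmax - nmin) / (omax - omin) }'
}

rescale 0.5 0 1 1 25   # the example from above, prints 13
rescale 0 0 1 1 25     # old minimum maps to the new minimum, prints 1
rescale 1 0 1 1 25     # old maximum maps to the new maximum, prints 25
```

Note that the function does not guard against o.max being equal to o.min, which would mean a division by zero.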


Read zipped file into R

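Sometimes I do not want to unzip files by hand before reading them into R. There is a nice way of reading a zipped file (via a temporary directory) into R. The function zip.file.extract() that I used originally is no longer available in current versions of R, but unzip() does the same job:

 # extract test.csv to the session's temporary directory and read it from there
 myfile <- read.csv(unzip("~/files/myzip.zip", files = "test.csv", exdir = tempdir()))

Here the file test.csv is stored inside the archive ~/files/myzip.zip.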
