Statistical distributions in Julia

This is a quick note on how to generate random variables in Julia and sample from them.

We first need to load the package Distributions (and install it, if that has not been done yet):
```
import Pkg
Pkg.add("Distributions")
using Distributions
```
All standard distributions are implemented and well documented. I will create a normally distributed random variable with mean 5 and standard deviation 10.
`nd = Normal(5, 10)`
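We can now apply various functions to it, e.g. to evaluate the probability density function (pdf) or the cumulative distribution function (cdf):
```
pdf.(nd, 1:5)  # density at the points 1, 2, ..., 5
cdf(nd, 5)     # P(X <= 5), here 0.5 because 5 is the mean
```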
Finally we would like to take a random sample of size 10
`rand(nd, 10)`
or fill an Array, a, with a random sample from this distribution:
```
using Random  # for rand!
a = Array{Float64}(undef, 10^5, 2)
rand!(nd, a)
```


Setting up emacs org-mode babel for R on Ubuntu

I installed org-mode separately, since I had trouble with its default setup (similar to the problems described here).
```
sudo apt-get install org-mode
```
Then I downloaded ESS into `~/.emacs.d/`:
```
cd ~/.emacs.d/
wget http://ess.r-project.org/downloads/ess/ess-12.09-1.zip
unzip ess-12.09-1.zip
rm ess-12.09-1.zip
```
Finally I had to add the following lines to my `.emacs` file:
```
(add-to-list 'load-path "~/.emacs.d/ess-12.09-1/lisp")
(require 'ess-site)
(org-babel-do-load-languages
 'org-babel-load-languages
 '((R . t)))
```
and to my `.Rprofile`:
```
old <- getOption("defaultPackages")
options(defaultPackages = c(old, "tikzDevice"))
```
For testing open a new file with emacs:
``` emacs test.org ```
and add the following code block:
```
#+begin_src R
date()
#+end_src
```
Press `C-c C-c` with the cursor inside the block to execute the code. See the org-mode documentation for more information.


Vim-R-plugin: Installation

On Ubuntu 12.04, I took the following steps to install Vim-R-plugin:

1. First I had to install tmux: `sudo apt-get install tmux`
2. Having already installed vim-pathogen, I used git to clone vim-r-plugin and the vim-screen plugin, which is also required:
```
cd ~/.vim/bundle
git clone git://github.com/vim-scripts/Vim-R-plugin
git clone git://github.com/vim-scripts/Screen-vim---gnu-screentmux
```
3. Then I changed my localleader to "," by adding `let maplocalleader = ","` to my .vimrc file.
4. Finally I added the following lines to my .Rprofile:
```
if (interactive()) {
  library(colorout)
  library(setwidth)
  library(vimcom)
}
```

That's it, everything works perfectly.


Reproducible research with markdown, knitr and pandoc

Over the last few weeks I have been trying to optimise my workflow using markdown in combination with knitr and pandoc. Knitr is a great new package by Yihui, expanding R's capabilities for reproducible research.

I will illustrate my workflow with the following example, where I have a small R script (script.r) that I want to embed into a report. However, I do not want to write LaTeX, nor do I want to (or can I) fix my final output format at the beginning. That is where pandoc comes in. Pandoc is the Swiss army knife when it comes to converting between markup languages.

The file script.r:

```
## @knitr gen-dat
a <- matrix(rnorm(100), nrow=10)

## @knitr plot
image(a)

```

The report is written in pandoc-flavored markdown. The file (report_knit_.md) contains:

````
% A sample report
% The author
% `r date()`

<!-- Setting up R -->
```{r echo=FALSE}
knitr::opts_chunk$set(warning=FALSE, dev="png", fig.cap="", cache=FALSE)
```

<!-- read external r code -->
```{r echo=FALSE}
knitr::read_chunk('script.r')
```

# The first part of my R script
Here I can generate my data
```{r}
<<gen-dat>>
```

# Results
And now the results are plotted
```{r plot-fig, results="asis"}
<<plot>>
```

# More
Of course I can use inline elements: 3 + 3 = `r 3+3`.
````

For each chunk there are plenty of options to modify it (see options).

To render my report, I need to first knit it in R and then use pandoc to convert it to the final format. This can be done with

```
Rscript -e "library(knitr); knit('report_knit_.md')"

```

This results in a pandoc-flavored markdown document. Now I can use pandoc to convert this document into any output format that pandoc supports (list of formats):

• A pdf file: `pandoc -s report.md -t latex -o report.pdf`
• An html file: `pandoc -s report.md -o report.html` (with the `-c` flag CSS files can be added easily)
• OpenOffice: `pandoc report.md -o report.odt`
• Word docx: `pandoc report.md -o report.docx`
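
If I want to run both steps from within R, they can be chained roughly like this. This is only a sketch: `render_report()` and its argument names are a hypothetical helper of my own and not part of knitr or pandoc, and it assumes that the pandoc binary is on the PATH.

```r
# Sketch: knit the report and convert it with pandoc in one go.
# render_report() is a hypothetical helper, not part of knitr or pandoc.
render_report <- function(input = "report_knit_.md",
                          knitted = "report.md",
                          output = "report.html") {
  # Step 1: run the R chunks and write plain pandoc markdown
  knitr::knit(input, output = knitted)
  # Step 2: call the pandoc binary (assumed to be on the PATH)
  status <- system2("pandoc", c("-s", knitted, "-o", output))
  if (status != 0) stop("pandoc failed with exit status ", status)
  invisible(output)
}

# Example call (not run here): render_report(output = "report.docx")
```

Pandoc picks the output format from the file extension, so changing `output` is enough to switch between html, odt and docx.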

Files are available on GitHub.


lubridate: working with date and time in R

Working with date and time in R is sometimes tricky. Today I gave lubridate a try and was surprised by how easy it can be. Lubridate is available on GitHub and on CRAN. There is also a good introduction published in the Journal of Statistical Software.

```
install.packages("lubridate")
library(lubridate)
```
```
# Create an object
bday <- dmy("23121984")
```

This could also have been achieved with any other ordering of d(ay), m(onth) and y(ear), e.g. ymd() or dym().
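
As a small illustration of this, the following sketch parses the same date in three different orders and checks that the results agree. The variable names `d1`, `d2` and `d3` and the separators in the input strings are just my choices for this example.

```r
library(lubridate)

# The same date written in three different orders
d1 <- dmy("23121984")    # day, month, year
d2 <- ymd("1984-12-23")  # year, month, day
d3 <- mdy("12/23/1984")  # month, day, year

d1 == d2 && d2 == d3  # TRUE
```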

Several options are provided to work with the bday object:

```
wday(bday)  # day of the week
wday(bday, label=TRUE)  # day of the week, abbreviated
yday(bday)  # day of the year
```

lubridate also makes it easy to calculate with dates:

```
wday(bday + years(1), label=TRUE)  # day of week one year later

table(sapply(1:100, function(x) wday(bday + years(x), label=TRUE)))  # days of the week for next 100 years.
```

Add areas to a vector in GRASS

I sometimes use r.to.vect to convert a raster map in GRASS to a vector map. One way to find the area of each of the new polygons is:

```
# I am working with the vector map forests
v.to.db map=forests option=area units=hectares columns=area
```

This fills the column area of forests with the size of each polygon in hectares.

Check ports

To check which ports are open on a server:

```
nmap <ip of the server>

```

To check, on the other hand, which ports a server is listening on (before the firewall), run on the server:

```
netstat -tulpn

```

where `-t` selects TCP, `-u` selects UDP, `-l` shows listening sockets only, `-p` shows the process ID and program name, and `-n` shows numerical addresses.

Kill all processes of a user

To list all processes of myuser, extract their process IDs and kill them:

```
# list all processes of a user
ps -fu myuser

# extract the pids (second column, skipping the header line)
ps -fu myuser | awk 'NR != 1 {print $2}'

# kill them all
kill -9 $(ps -fu myuser | awk 'NR != 1 {print $2}')
```

Rescale ranges

Let's say I have values in a range from 0 to 1 and I want to rescale them to a range from 1 to 25. Generally speaking this can be done with `n.min + (x - o.min) * (n.max - n.min) / (o.max - o.min)`, where x is the value to be rescaled and n and o stand for new and old. For example, to rescale the value 0.5 on our old scale ranging from 0 to 1 to our new scale we use:
`1 + (0.5 - 0) * (25 - 1) / (1 - 0) = 13`
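
In R the formula can be wrapped into a small function. This is a minimal sketch: the name `rescale` is my own choice for this example, and the argument names simply mirror the symbols used above.

```r
# Hypothetical helper: linearly rescale x from [o.min, o.max] to [n.min, n.max]
rescale <- function(x, o.min, o.max, n.min, n.max) {
  n.min + (x - o.min) * (n.max - n.min) / (o.max - o.min)
}

rescale(0.5, o.min = 0, o.max = 1, n.min = 1, n.max = 25)  # 13
rescale(c(0, 0.25, 1), 0, 1, 1, 25)                        # 1 7 25
```

Because all operations are vectorised, a whole vector of values can be rescaled in one call.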

``` myfile <- read.csv(zip.file.extract("~/files/test.csv", "myzip.zip"))