Shell Note

Yet another learning note, after the Perl note and the Golang note.

last update: 2016-03-31

loop

rs=$(awk 'BEGIN{for(i=1;i<=10;i+=0.01)print i}')
for r in $rs; do
    echo $r
done
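
Adding 0.01 repeatedly in floating point can drift, so the last value of the range may be off or missing. A minimal sketch of one way around that, assuming two decimals are enough: count in integers and divide when printing.

```shell
# Count with integers and divide, so every step is exact to two decimals.
rs=$(awk 'BEGIN { for (i = 100; i <= 1000; i++) printf "%.2f\n", i / 100 }')

n=0
for r in $rs; do
    n=$((n + 1))   # number of values seen so far
    last=$r        # remember the most recent value
done
echo "$n values, last is $last"
```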

grep

  • match whole words only: grep -sw (-w matches whole words, -s suppresses error messages about missing or unreadable files)

String

  • string length: var="hello"; len=${#var}
  • replace (here: delete the first ".fa"): prefix=${file/\.fa/}; to strip only a trailing ".fa", use prefix=${file%.fa}
  • regular expression:

    $ echo -e "chr11\t123\t345\n" |   while read line <&0; do echo $line; re="chr([0-9]+)";  if [[ $line =~ $re ]]; then echo -e "${BASH_REMATCH[1]}\t$line"; fi; done
    chr11 123 345
    11      chr11   123     345
    
  • Linux shell string operations (length, search, replace) explained
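
The string operations above can be tried together in one small Bash script. This is only a sketch; the file name and the sample line are made-up values for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample values, for illustration only.
file="sample.fa"
var="hello"

len=${#var}                    # length of $var in characters
prefix=${file%.fa}             # strip a trailing ".fa"
swapped=${file/.fa/.fasta}     # replace the first ".fa" with ".fasta"

# Pull the chromosome number out with a regex capture group.
line=$'chr11\t123\t345'
re='chr([0-9]+)'
if [[ $line =~ $re ]]; then
    chrom=${BASH_REMATCH[1]}
fi

printf '%s\n' "$len" "$prefix" "$swapped" "$chrom"
```

Keeping the pattern in a variable (re) and leaving it unquoted on the right of =~ avoids quoting surprises inside [[ ]].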

Path/Dir

assign glob matches to a variable: f=(*.fa). This creates an array holding every match, but a plain $f expands to the first element only; use "${f[@]}" to get all of them.
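
A small sketch of how such an array behaves in Bash, run in a throwaway directory with made-up file names:

```shell
#!/usr/bin/env bash
# Work in a scratch directory with hypothetical file names.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch a.fa b.fa c.fa

f=(*.fa)            # array with one element per matching file
first=$f            # a plain $f is the first element only
count=${#f[@]}      # number of elements
all="${f[*]}"       # all elements joined by spaces

echo "first=$first count=$count all=$all"
for x in "${f[@]}"; do   # one file name per iteration
    echo "$x"
done

cd - > /dev/null
```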

get script source path

#!/bin/bash
SOURCE="$0"
while [ -h "$SOURCE"  ]; do # resolve $SOURCE until the file is no longer a symlink
    DIR="$( cd -P "$( dirname "$SOURCE"  )" && pwd  )"
    SOURCE="$(readlink "$SOURCE")"
    [[ $SOURCE != /*  ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR="$( cd -P "$( dirname "$SOURCE"  )" && pwd  )"

delete huge number of files:

mkdir /tmp/emptydir && rsync --delete-before -a -H -v --progress --stats /tmp/emptydir/ targetdir/

list directories

ls -d */

I/O

concatenate the output of two commands into a single input stream with process substitution (anonymous FIFOs); more: Using Named Pipes and Process Substitution

cat <(command1) <(command2) > outputfile
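
To make this concrete, here is a sketch with printf standing in for command1 and command2 (both are placeholders, not real workloads). It needs Bash, since plain sh has no process substitution.

```shell
#!/usr/bin/env bash
# Each <(...) is replaced by a file name that the outer command can read.
merged=$(cat <(printf 'a\nb\n') <(printf 'c\n'))
echo "$merged"

# The same trick feeds two command outputs to a tool that expects two files.
# comm -12 prints only the lines common to both (sorted) inputs.
common=$(comm -12 <(printf '1\n2\n3\n') <(printf '2\n3\n4\n'))
echo "$common"
```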

read from STDIN

ls crun* | while read -r f; do du -sh "$f"; done

Find

find ./ -name "*.fa" -exec sh -c 'cat "$1" | fasta2tab -l | cut -f 3 > "$1.len"' sh {} \;
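
Passing the file name to sh as a positional argument, instead of pasting {} into the script text, keeps names with spaces or quotes intact. A sketch of that pattern, with wc -c as a stand-in for the real per-file command and a made-up file name:

```shell
# Scratch directory with one hypothetical FASTA file whose name contains a space.
tmp=$(mktemp -d)
printf '>s1\nACGT\n' > "$tmp/a b.fa"

# {} is handed to sh as $1; the word "sh" before it fills $0.
# "wc -c" is only a placeholder for the real per-file command.
find "$tmp" -name '*.fa' -exec sh -c 'wc -c < "$1" > "$1.len"' sh {} \;

cat "$tmp/a b.fa.len"
```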

sed/awk/paste

  • csv2tab: awk -F'^"|","|,"|",|,|"$' '{ out=$1; for(i=2;i<=NF;i++){out=out"\t"$i}; print out}' data.csv
  • delete first line: sed 1d file, tail -n +2 file, sed -n '2,$p' file, or awk 'FNR>1' file
  • delete last column: awk 'NF-=1' file
  • delete last 5 rows: seq 10 | tac | awk 'FNR > 5' | tac
  • merge 2nd column of all tsv file: source <(echo paste $(for f in *.tsv; do echo "<(cut -f 2 $f)"; done | paste -sd" ") )
  • reorder columns: move last column to the 3rd: awk '{$3=$NF OFS $3;$NF=""}7' file
  • print a "#1.2" line, then the line count of $f minus one and its number of columns: echo -ne "#1.2\n"$(( $(wc -l < "$f") - 1 ))"\t"$(awk -F'\t' '{print NF; exit}' "$f")
  • sort file names by number:

$ ls *.xml
12.xml  1.xml  2.xml  3.xml
$ ls *.xml | perl -ne 'chomp; /(\d+)\./; print "$_ $1\n"; ' | sort -k2,2n |  awk '{print $1}' | paste -sd" "
1.xml 2.xml 3.xml 12.xml
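
When the number is the first part of the name, sort alone can do the numeric ordering. A sketch under that assumption, with the file names fed in by printf instead of ls, plus a quick check of the first-line deletion:

```shell
# Split each name on "." and compare the first field numerically,
# then join the sorted lines with spaces.
sorted=$(printf '%s\n' 12.xml 1.xml 2.xml 3.xml |
    sort -t . -k 1,1n |
    paste -sd ' ' -)
echo "$sorted"

# Drop the first line of a stream.
body=$(printf 'header\n1\n2\n' | sed 1d)
echo "$body"
```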

default separator

awk  -F  default: whitespace (runs of spaces, TABs, and newlines)
sort -t  default: the boundary between a non-blank and a blank character
cut  -d  default: TAB
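
The difference between the awk and cut defaults shows up as soon as a line mixes spaces and TABs. A short sketch with a made-up input line:

```shell
# One line containing two spaces, a TAB, and a single space.
line=$(printf 'a  b\tc d')

# awk splits on any run of whitespace: the fields are a, b, c, d.
awk_f2=$(printf '%s\n' "$line" | awk '{print $2}')

# cut splits on TAB only: the fields are "a  b" and "c d".
cut_f2=$(printf '%s\n' "$line" | cut -f 2)

echo "awk: $awk_f2"
echo "cut: $cut_f2"
```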

output delimiter

awk -v OFS="\t" ....

split file by given value of given column

file=data.tsv
while IFS= read -r line; do
    key=$(printf '%s\n' "$line" | cut -f 1)   # value of column 1 (TAB-separated)
    printf '%s\n' "$line" >> "${file}_$key"   # append the line unchanged, TABs kept
done < "$file"
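
For large files, spawning cut once per line gets slow. A sketch of the same split done in a single awk pass, assuming TAB-separated input and using made-up sample data in a scratch directory:

```shell
# Hypothetical sample data.
tmp=$(mktemp -d)
file="$tmp/data.tsv"
printf 'k1\t1\nk2\t2\nk1\t3\n' > "$file"

# Append each line to "<file>_<value of column 1>".
# close() keeps the number of open files small when there are many keys.
awk -F '\t' -v prefix="$file" '{ out = prefix "_" $1; print >> out; close(out) }' "$file"

cat "${file}_k1"
```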

Zipped file

  • 7zip for exchanging files between Windows (GBK) and Linux: 7za a -r dir.7z dir/
  • zip with no compression: zip -r -0 bootanimation.zip bootanimation
  • create rar file: rar a archivename -hp{passwd} -r dir/
  • determine uncompressed size of a gzip file: gzip -l compressedfile.gz

http

Using wget to recursively fetch a directory

wget -r --no-parent --reject "index.html*" -e robots=off http://www.example.com/

Performance

parallel --jobs 16 someCommand data{}.fastq '>' output{}.fastq ::: {1..512}

gnu-parallel-note

Misc

  • How to use shell variables in awk script: awk -v var="$variable" 'BEGIN {print var}'
  • calculator, sum: seq 10 | paste -sd+ | bc
  • alter submitted PBS jobs: qalter -lwalltime=15:00:00:00 ID
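
A short sketch of the first two items, with a made-up threshold value. The sum is done in awk here, which is an alternative for machines where bc is not installed.

```shell
# Pass a shell variable into awk with -v and use it in a pattern.
threshold=3
above=$(seq 5 | awk -v t="$threshold" '$1 > t' | paste -sd ' ' -)
echo "$above"

# Sum a column of numbers without bc.
total=$(seq 10 | awk '{ s += $1 } END { print s }')
echo "$total"
```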

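Running many background tasks with screen

  1. Create a session: screen -dmS job1
  2. List sessions: screen -list
  3. Attach to the session: screen -r job1
  4. Run the task; there is no need to send it to the background.
  5. Detach from the current session: press Ctrl + A, then D, until a message like "[detached from 25721.job1]" appears.
  6. Reattach to the session as in step 3.
  7. End the session: attach to it and leave with exit as usual.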

Moving a running program to the background

disown example 1 (if the command was already started in the background with "&", "disown" can be used directly)
    [root@pvcent107 build]# cp -r testLargeFile largeFile &
    [1] 4825
    [root@pvcent107 build]# jobs
    [1]+  Running                 cp -i -r testLargeFile largeFile &
    [root@pvcent107 build]# disown -h %1
    [root@pvcent107 build]# ps -ef |grep largeFile
    root      4825   968  1 09:46 pts/4    00:00:00 cp -i -r testLargeFile largeFile
    root      4853   968  0 09:46 pts/4    00:00:00 grep largeFile
    [root@pvcent107 build]# logout

disown example 2 (if the command was not started with "&", use CTRL-z and "bg" to move it to the background, then use "disown")
    [root@pvcent107 build]# cp -r testLargeFile largeFile2
    [1]+  Stopped                 cp -i -r testLargeFile largeFile2
    [root@pvcent107 build]# bg %1
    [1]+ cp -i -r testLargeFile largeFile2 &
    [root@pvcent107 build]# jobs
    [1]+  Running                 cp -i -r testLargeFile largeFile2 &
    [root@pvcent107 build]# disown -h %1
    [root@pvcent107 build]# ps -ef |grep largeFile2
    root      5790  5577  1 10:04 pts/3    00:00:00 cp -i -r testLargeFile largeFile2
    root      5824  5577  0 10:05 pts/3    00:00:00 grep largeFile2
    [root@pvcent107 build]#

rpm usage

  • install: rpm -ivh
  • reinstall: rpm -ivh --force
  • upgrade: rpm -Uvh
  • remove: rpm -e

md5sum

  • compute: md5sum *.gz > md5sumcheck
  • check: md5sum -c md5sumcheck
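
The compute-and-check cycle can be tried on scratch files. This sketch assumes GNU md5sum and uses made-up file names; the second check is expected to fail because one file is changed in between.

```shell
# Two hypothetical files in a scratch directory.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a.txt"
printf 'world\n' > "$tmp/b.txt"

# Record the checksums, then verify them.
md5sum "$tmp"/*.txt > "$tmp/md5sumcheck"
ok=no
md5sum -c "$tmp/md5sumcheck" && ok=yes

# After a file changes, the check exits with a non-zero status.
printf 'changed\n' > "$tmp/b.txt"
tampered=no
md5sum -c "$tmp/md5sumcheck" > /dev/null 2>&1 || tampered=yes

echo "ok=$ok tampered=$tampered"
```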

Workaround for vsftpd not supporting symlinked directories

sudo mount --bind /home/shenwei/VirtualBox\ VMs/pool/softs /var/ftp/pub/softs

Upgrade all Perl modules: perl -MCPAN -e 'CPAN::Shell->install(CPAN::Shell->r)'

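Working efficiently in Bash: a collection of handy Linux/Unix shell shortcuts

The following shortcuts are very useful and can greatly improve your productivity:

CTRL + U – cut everything before the cursor
CTRL + K – cut from the cursor to the end of the line
CTRL + Y – paste
CTRL + E – move the cursor to the end of the line
CTRL + A – move the cursor to the start of the line
ALT + F – jump forward one word (to the next space)
ALT + B – jump back one word (to the previous space)
ALT + Backspace – delete the previous word
CTRL + W – cut the word before the cursor
Shift + Insert – paste text into the terminal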

Invert image colors:

parallel convert -negate {} {.}.png ::: *.jpg
convert -negate src.png dst.png