What does ${0##*/} mean in shell scripting?

This looks a bit cryptic, but it is an example of shell parameter expansion. In shell scripting, the dollar sign can do more than plain variable substitution: it can also manipulate strings, standing in for simple uses of commands such as wc and sed.

Here $0 is the name of the currently running script, such as script.sh or /usr/local/bin/script.sh, depending on how it was called.

## removes the longest matching prefix, and */ is the glob pattern (not a regex) of the prefix to be removed. Combined, ##*/ strips everything from $0 up to and including the last /. So /usr/local/bin/script.sh becomes script.sh. This is just like the basename command, except that it’s faster because the shell does not need to create a new process.
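To make the longest-match rule concrete, here is a rough Python sketch that mimics the # and ## prefix-removal operators. The helper names strip_shortest_prefix and strip_longest_prefix are made up for this illustration, and fnmatch only approximates shell pattern matching.

```python
from fnmatch import fnmatchcase


def strip_longest_prefix(value, pattern):
    """Mimic ${value##pattern}: drop the longest prefix matching the glob."""
    for end in range(len(value), -1, -1):  # try the longest prefix first
        if fnmatchcase(value[:end], pattern):
            return value[end:]
    return value  # no prefix matched, so the value is unchanged


def strip_shortest_prefix(value, pattern):
    """Mimic ${value#pattern}: drop the shortest prefix matching the glob."""
    for end in range(len(value) + 1):  # try the shortest prefix first
        if fnmatchcase(value[:end], pattern):
            return value[end:]
    return value  # no prefix matched, so the value is unchanged


path = "/usr/local/bin/script.sh"
print(strip_longest_prefix(path, "*/"))   # script.sh
print(strip_shortest_prefix(path, "*/"))  # usr/local/bin/script.sh
```

The single # form stops at the first /, which is why the double ## form is the one that behaves like basename.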

SUSE 11 telnetd quirk

I was trying to write a Python script to query some data from a machine over telnet. For testing, I used the telnetd server program that came with SUSE 11. For some reason, the example code for telnetlib refused to work. After much troubleshooting, I discovered that it was halting behind the scenes on this prompt:

Last login: Mon May 17 15:22:09 SGT 2010 from localhost on pts/4
tset: unknown terminal type network
Terminal type?

The script could log in, but the tset utility did not know how to initialize a terminal of type “network” and was asking which terminal type to use. The terminfo entry for the “network” terminal type was undefined. This problem seems to be specific to SUSE servers. I was able to solve it by getting my script to reply to the prompt with an alternative terminal type, such as vt100.
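The workaround can be sketched roughly as follows. This is not the original script: answer_terminal_prompt is a hypothetical helper, conn is assumed to be a telnetlib.Telnet-style object that exposes expect() and write(), and the shell prompt pattern is a guess that will differ between systems. Also note that telnetlib was deprecated in Python 3.11 and removed in Python 3.13.

```python
import re

# Patterns are bytes because telnetlib works with bytes.
TERMINAL_PROMPT = re.compile(rb"Terminal type\?")
SHELL_PROMPT = re.compile(rb"[$#>] ?$")  # assumption: adjust to the real prompt


def answer_terminal_prompt(conn, terminal=b"vt100", timeout=5):
    """Reply to tset's "Terminal type?" question if it appears after login.

    Returns True if the prompt was seen and answered, False otherwise.
    """
    index, _match, _data = conn.expect([TERMINAL_PROMPT, SHELL_PROMPT], timeout)
    if index == 0:
        conn.write(terminal + b"\n")
        return True
    return False
```

Calling this right after sending the password lets the rest of the script carry on as if the prompt had never appeared.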

Setting up port forwarding with iptables

This is a quick note on setting up port forwarding with iptables.

Port forwarding is the process of changing the destination address (and/or port) of IP packets so that they can reach a host residing behind a firewall or NAT device.

Under the hood, the port forwarding host changes the destination address of incoming packets to point to the target host on the internal network. It also changes the source address of the packets from that of the origin host to its own. This is needed so that the target host replies to the port forwarding host rather than replying directly to the origin host.

The way to do this in Linux is with the iptables firewall rules. Here’s how:

First you need to enable IP forwarding.

  echo 1 > /proc/sys/net/ipv4/ip_forward  # to make this permanent, edit sysctl.conf

Then you need to add rules to the PREROUTING and POSTROUTING chains in the nat table. The nat table is consulted when a packet that creates a new connection is encountered.

iptables -t nat -A PREROUTING -p tcp -i eth0 -d 192.168.16.62 --dport 80 -j DNAT --to-destination 10.0.0.90:80

Here the port forwarding host has eth0 at 192.168.16.62 and eth1 at 10.0.0.94. eth0 is on the “external” network and eth1 is on the “internal” network. 10.0.0.90 is the address of the host that we want to forward packets to. The target of this iptables rule is DNAT (-j DNAT). This target performs Destination Network Address Translation, which means that it rewrites the destination IP address of a packet. If a packet matches an iptables rule whose target is DNAT, all subsequent packets in the same connection are translated and routed to the specified address. The rule above takes any incoming TCP packet with the destination address 192.168.16.62 and port 80, changes the destination address to 10.0.0.90, and then sends it on for routing (which is why IP forwarding needs to be enabled).

Another iptables rule is needed to change the source address of the forwarded packets so that return packets from the target host (10.0.0.90) are routed to the port forwarding host instead of directly to the origin host. A direct reply would fail, because the origin host would receive it from an address it never contacted.

iptables -t nat -A POSTROUTING -d 10.0.0.90 -j SNAT --to-source 10.0.0.94
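As a toy model of what the two rules do to a packet, the following Python sketch rewrites addresses the same way. It is only an illustration, not how netfilter is implemented: the Packet class and the dnat and snat functions are made up, and 192.168.16.10 is an invented origin host.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int


FORWARDER_EXTERNAL = "192.168.16.62"  # eth0 on the port forwarding host
FORWARDER_INTERNAL = "10.0.0.94"      # eth1 on the port forwarding host
TARGET = "10.0.0.90"                  # host the traffic is forwarded to


def dnat(packet):
    """PREROUTING rule: rewrite the destination of packets for port 80."""
    if packet.dst == FORWARDER_EXTERNAL and packet.dport == 80:
        return replace(packet, dst=TARGET, dport=80)
    return packet


def snat(packet):
    """POSTROUTING rule: rewrite the source of packets sent to the target."""
    if packet.dst == TARGET:
        return replace(packet, src=FORWARDER_INTERNAL)
    return packet


# 192.168.16.10 is a made-up origin host used only for this example.
incoming = Packet(src="192.168.16.10", dst=FORWARDER_EXTERNAL, dport=80)
forwarded = snat(dnat(incoming))
print(forwarded)
```

After both steps the target host sees a packet coming from 10.0.0.94, so its reply goes back through the port forwarding host, which reverses the translation.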

Alias IP addresses

Alias IP addresses can be useful in some situations. Here’s how to add them to your ethernet interface…

You can use the ifconfig <alias-interface> command :

$ ifconfig eth0:1 10.10.10.1 netmask 255.255.255.0
 
$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:1d:7d:12:5e:2a
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::21d:7dff:fe12:5e2a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:136116 errors:0 dropped:0 overruns:0 frame:0
          TX packets:103309 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:167184450 (167.1 MB)  TX bytes:13659079 (13.6 MB)
          Interrupt:27 Base address:0x8000 
 
eth0:1    Link encap:Ethernet  HWaddr 00:1d:7d:12:5e:2a
          inet addr:10.10.10.1  Bcast:10.10.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:27 Base address:0x8000

To delete the alias IP address, run ifconfig <alias-interface> down.

$ ifconfig eth0:1 down

LVM snapshots

An LVM snapshot is a special logical volume (LV) that, when created, is a copy of another LV at that point in time. A sysadmin can use this facility to make quick backups of LVs even if they are mounted and being used by a running system. This does require that the snapshot be made when the data on the LV is in a consistent state. Thanks to the VFS-lock patch in LVM1, many filesystems (including ext3) allow consistent snapshots to be taken at any time. However, do check whether your filesystem allows this or requires extra steps before a consistent snapshot can be taken. The XFS filesystem, for example, may require the sysadmin to run xfs_freeze to pause the filesystem first.

Here are some things that I’ve observed about snapshot volumes.

1) Snapshot volumes consume free LVM extents.

# vgdisplay vg1
  --- Volume group ---
  VG Name               vg1
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               700.00 MB
  PE Size               4.00 MB
  Total PE              175
  Alloc PE / Size       125 / 500.00 MB
  Free  PE / Size       50 / 200.00 MB
  VG UUID               IE77Fv-Lr7z-2H8c-JsMR-fPNG-jx4s-wLNtck
 
 # lvcreate -s -n snaptest1 /dev/vg1/test_lv -L 100M
  Logical volume "snaptest1" created
 
 # vgdisplay vg1
  --- Volume group ---
  VG Name               vg1
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  16
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               700.00 MB
  PE Size               4.00 MB
  Total PE              175
  Alloc PE / Size       150 / 600.00 MB
  Free  PE / Size       25 / 100.00 MB
  VG UUID               IE77Fv-Lr7z-2H8c-JsMR-fPNG-jx4s-wLNtck

Here I am creating a 100MB snapshot named “snaptest1” of test_lv (500MB). Both of them reside in the volume group “vg1”. Because snaptest1 is a 100MB snapshot, it takes 100MB of free extents from vg1. There does not seem to be any way to specify another volume group for this, which implies that a snapshot must be created in the same volume group as the LV that it is based on.
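The extent arithmetic behind the two vgdisplay outputs can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: extents_needed is a made-up helper, and it assumes a requested size is rounded up to a whole number of physical extents.

```python
import math


def extents_needed(size_mb, pe_size_mb):
    """Number of physical extents required to hold size_mb."""
    return math.ceil(size_mb / pe_size_mb)


pe_size_mb = 4        # "PE Size" from vgdisplay
free_before = 50      # "Free PE" before the snapshot was created
allocated_before = 125

used = extents_needed(100, pe_size_mb)
print(used)                     # extents taken by the 100MB snapshot
print(free_before - used)       # free extents left afterwards
print(allocated_before + used)  # allocated extents afterwards
```

With a 4MB extent size this gives 25 extents for the snapshot, which matches Free PE dropping from 50 to 25 and Alloc PE rising from 125 to 150.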

2) LVM2 snapshot volumes can be mounted read/write.

It will appear to be the same as the origin volume at the point in time the snapshot was created.

 # mount /dev/vg1/snaptest1 /mnt/test2
 
 # ls /mnt/test*
/mnt/test:
lost+found  testfile.txt
 
/mnt/test2:
lost+found  testfile.txt
 
 # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg1-test_lv
                      485M   17M  443M   4% /mnt/test
/dev/mapper/vg1-snaptest1
                      485M   17M  443M   4% /mnt/test2
 
#  cat /mnt/test/testfile.txt
this is a test file
 
#  cat /mnt/test2/testfile.txt
this is a test file

Writing to a snapshot volume consumes snapshot space (reading does too, but only a tiny amount); see the next point. Note that LVM1 snapshot volumes are read-only.

3) Block level changes consume snapshot space

# lvs
  LV        VG   Attr   LSize   Origin  Snap%  Move Log Copy%  Convert
  opt_lv    vg0  -wi-ao   1.00G
  root_lv   vg0  -wi-ao   6.00G
  swap_lv   vg0  -wi-ao   1.00G
  snaptest1 vg1  swi-a- 100.00M test_lv   0.01
  test_lv   vg1  owi-ao 500.00M
 
# dd if=/dev/zero of=/mnt/test/test1.img bs=1M count=20
20+0 records in
20+0 records out
20971520 bytes (21 MB) copied, 0.0414562 s, 506 MB/s
 
 # lvs
  LV        VG   Attr   LSize   Origin  Snap%  Move Log Copy%  Convert
  opt_lv    vg0  -wi-ao   1.00G
  root_lv   vg0  -wi-ao   6.00G
  swap_lv   vg0  -wi-ao   1.00G
  snaptest1 vg1  swi-a- 100.00M test_lv  20.19
  test_lv   vg1  owi-ao 500.00M

Creating a 20MB file on test_lv uses up about 20% of snaptest1’s space (see the Snap% column). Since the snapshot is 100MB in size, that works out to about 20MB.

Writing to the snapshot volume also uses up snapshot space.

# dd if=/dev/zero of=/mnt/test2/test1_2.img bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0216117 s, 485 MB/s
 
 # lvs
  LV        VG   Attr   LSize   Origin  Snap%  Move Log Copy%  Convert
  opt_lv    vg0  -wi-ao   1.00G
  root_lv   vg0  -wi-ao   6.00G
  swap_lv   vg0  -wi-ao   1.00G
  snaptest1 vg1  swi-ao 100.00M test_lv  30.27
  test_lv   vg1  owi-ao 500.00M

Here, creating a 10MB file on the mounted snapshot volume used up about 10MB worth of snapshot space.

However, deletion only uses a tiny bit of snapshot space.

#  rm /mnt/test/test1.img
# lvs
  LV        VG   Attr   LSize   Origin  Snap%  Move Log Copy%  Convert
  opt_lv    vg0  -wi-ao   1.00G
  root_lv   vg0  -wi-ao   6.00G
  swap_lv   vg0  -wi-ao   1.00G
  snaptest1 vg1  swi-ao 100.00M test_lv  30.29
  test_lv   vg1  owi-ao 500.00M

Snapshot LVs store changes at the block level only. Creating a 20MB file results in 20MB worth of block-level changes and hence requires 20MB of snapshot space. A deletion only updates filesystem metadata such as the file’s inode, which changes just a few blocks.

Interestingly, other seemingly non-modifying operations, such as mounting, unmounting, and listing files with the ls command, also consume a bit of snapshot space each time they are performed. These operations modify internal filesystem structures invisibly to the user.
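The Snap% figures above can be approximated with simple arithmetic. The function below is a rough estimate only; estimated_snap_percent is a made-up name, and the real numbers run slightly higher (20.19 and 30.27 in the output above) because filesystem metadata changes are counted too.

```python
def estimated_snap_percent(changed_mb, snapshot_size_mb):
    """Rough share of snapshot space used by changed_mb of changed blocks."""
    return min(100.0, 100.0 * changed_mb / snapshot_size_mb)


snapshot_size_mb = 100

print(estimated_snap_percent(20, snapshot_size_mb))       # after the 20MB write
print(estimated_snap_percent(20 + 10, snapshot_size_mb))  # plus the 10MB write
```

The same estimate is a quick way to judge how large a snapshot needs to be for the amount of change expected during its lifetime.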

4) Once a snapshot LV is full, it is no longer usable.

# dd if=/dev/zero of=/mnt/test/test2.img bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.217156 s, 483 MB/s
 
# lvs
/dev/dm-4: read failed after 0 of 4096 at 0: Input/output error
LV        VG   Attr   LSize   Origin  Snap%  Move Log Copy%  Convert
opt_lv    vg0  -wi-ao   1.00G
root_lv   vg0  -wi-ao   6.00G
swap_lv   vg0  -wi-ao   1.00G
snaptest1 vg1  Swi-Io 100.00M test_lv 100.00
test_lv   vg1  owi-ao 500.00M

Here, after creating a 100MB file on test_lv, snaptest1 is 100% full (see the Snap% column). The snapshot LV is no longer usable, and a new snapshot should be created to replace it.

What does it mean for a snapshot to be unusable?

If it is still mounted you might still be able to read from it.

# cat /mnt/test2/testfile.txt
this is a test file

But you won’t be able to write to it

# touch  /mnt/test2/testfile2.txt
touch: cannot touch `/mnt/test2/testfile2.txt': Read-only file system

And you won’t be able to mount it again.

 # mount -t ext3 /dev/vg1/snaptest1 /mnt/test2
mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg1-snaptest1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

Uses of snapshots

Snapshot volumes are great for making quick copies of logical volumes for backups or experiments. For example, you can make nightly backups of the home directories of your users so that when they lose important data, you can restore them easily. Another use might be to make a quick copy of your virtual machines’ file backends for experimentation.

Performance

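Active snapshot volumes incur extra I/O operations and may cause performance degradation: the first write to a block on the origin volume also triggers a copy of the original block into the snapshot. Keep this in mind on busy volumes, and remove snapshots once they are no longer needed.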

The uses of __slots__ in Python

By default, Python objects use a dictionary to hold their instance attributes. Looking up an instance attribute consults this dictionary, which is named __dict__.

>>> class classA(object):
...     def __init__(self):
...         self.x = 1
...         self.y = 2
 
>>> objA = classA()
 
>>> objA.__dict__
{'y': 2, 'x': 1}

This is OK for most cases, but it wastes memory if you are creating many (say thousands of) Python objects, because each object instance carries its own dictionary. To solve this, the __slots__ class declaration was introduced in Python 2.2. The __slots__ declaration takes a sequence of instance variable names and reserves just enough memory for them.

>>> class classA(object):
...     __slots__ = ('x', 'y')
...     def __init__(self):
...         self.x = 1
...         self.y = 2
...
 
>>> objA = classA()
 
>>> objA.__slots__
('x', 'y')

The __slots__ declaration also has another use besides memory optimization. Dictionary-based Python objects are highly flexible. One can add new instance variables to them on the fly:

>>> objA.z=3
 
>>> objA.__dict__
{'y': 2, 'x': 1, 'z': 3}

This is both powerful and highly flexible. But it can also be a source of subtle errors: typos cannot be easily detected because a misspelled variable name simply gets added to the object as a new instance variable, and objects can accumulate obsolete or useless attributes over time. To disable this behavior, you can use __slots__ based objects, because they do not allow attributes to be added on the fly. Here objB is assumed to be an instance of classB, a class declared with __slots__=('x','y') just like the slotted classA above:

>>> objB.z=3
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'classB' object has no attribute 'z'
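Both effects can be seen side by side in a short script. This is a minimal sketch for Python 3; the class names PointDict and PointSlots are invented for the example, and the sizes it prints vary between Python versions, so treat them as a rough comparison only.

```python
import sys


class PointDict:
    """Regular class: every instance carries its own __dict__."""

    def __init__(self):
        self.x = 1
        self.y = 2


class PointSlots:
    """Slotted class: only the attributes named in __slots__ exist."""

    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 1
        self.y = 2


plain = PointDict()
slotted = PointSlots()

# Rough per-instance footprint; exact numbers vary between Python versions.
print(sys.getsizeof(plain) + sys.getsizeof(plain.__dict__))
print(sys.getsizeof(slotted))

plain.z = 3  # allowed: the new name is stored in plain.__dict__
try:
    slotted.z = 3  # rejected: "z" is not listed in __slots__
except AttributeError as exc:
    print("AttributeError:", exc)
```

Note that the slotted instance has no __dict__ at all, which is where both the memory saving and the stricter attribute checking come from.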

Non-relational database links

There’s been an increased interest in non-relational database systems in recent years as more people realize that an RDBMS might not be the best solution for every problem domain. For example, it may make sense to trade off the consistency guarantees of the relational model for easier scalability or faster performance. Or it may be difficult to fit your data, such as geographical information, into a relational schema. Or it may be easier for your developers not to have to use SQL, which can help to reduce the impedance mismatch between their development language’s data model (e.g. object-oriented) and the relational model.

Here are some links that I’ve stumbled across:

SCALE 8x: Relational vs. non-relational

http://lwn.net/Articles/376626/

Visual Guide to NoSQL Systems

http://blog.nahurst.com/visual-guide-to-nosql-systems

The End of a DBMS Era (Might be Upon Us)

http://cacm.acm.org/blogs/blog-cacm/32212-the-end-of-a-dbms-era-might-be-upon-us/fulltext

DBMSs for Science Applications: A Possible Solution

http://cacm.acm.org/blogs/blog-cacm/22489-dbmss-for-science-applications-a-possible-solution/fulltext