Wednesday, May 28, 2014

Problem of the dangling output log file

LOG ROTATION



Many of you might have faced this issue. Let's say we have a script try.sh and we run it as follows:


sh try.sh > output

or consider a web server that sends its logs to some log file.

Everything is fine as long as the log file stays small, but once it grows you may face disk-full issues. I faced this with my web server.
Now if you try to manually truncate the file, it will not work, since the file has been opened for writing and is still being written to.
Deleting it and creating a new file with the same name will not work either, because the running script still holds the original inode open, so the space is never freed.
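You can actually see such a deleted-but-still-open file on Linux with lsof (assuming lsof is installed); the +L1 flag lists open files whose link count is zero, i.e. files that have been unlinked but are still held by a process:

# list files that are deleted but still held open by some process
lsof +L1 | grep output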

So how can you rotate the log file and compress it?
The answer: use the Linux utility logrotate.

For more details, run:
man logrotate

It needs a config file describing the log file to be rotated.
Add a cron job that runs the following:

/usr/sbin/logrotate /etc/logrotate.conf
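If your distribution does not already run logrotate from /etc/cron.daily, a crontab entry such as this one is enough (it runs once a day at midnight; adjust the schedule as you like):

# m h dom mon dow  command
0 0 * * * /usr/sbin/logrotate /etc/logrotate.conf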


The block you must append to /etc/logrotate.conf is:
/home/script/output {
       daily
       rotate 12
       size 1G
       compress
       missingok
       notifempty
       copytruncate
}


Here /home/script/output is the path of the output file to be rotated.

The other directives mean the following:
daily - rotate the file daily
rotate 12 - keep up to 12 rotated copies
size 1G - rotate once the size exceeds 1 GB
compress - compress the log file after rotation
missingok - do not complain if the file is missing
notifempty - do not rotate if the file is empty
copytruncate - copy the file, then truncate the original in place, so the running script keeps writing to the same open file
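Before waiting for cron, you can dry-run the configuration and then force one rotation to check that everything behaves as expected (both are standard logrotate flags):

# dry run: report what would be rotated without changing anything
/usr/sbin/logrotate -d /etc/logrotate.conf

# force an immediate rotation to verify the setup
/usr/sbin/logrotate -f /etc/logrotate.conf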


With this, the original issue is solved.
You can now rotate the logs and save disk space.





TCMALLOC and glibc malloc

TCMALLOC

Installation steps:

Packages needed:


1) libunwind
$ tar -zxvf libunwind-1.1.tar.gz
$ cd libunwind-1.1
$ ./configure
$ make
$ make install


2) gperftools
$ tar -zxvf  gperftools-2.1.tar.gz
$ cd  gperftools-2.1
$ ./configure
$ make
$ make install
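By default both libraries install under /usr/local, so you may need to run ldconfig once so the loader picks up /usr/local/lib. After that you can point an existing binary at tcmalloc with LD_PRELOAD, or link it in at build time; app below is just a placeholder name:

# refresh the shared-library cache after installing to /usr/local/lib
ldconfig

# option 1: preload tcmalloc into an unmodified binary
LD_PRELOAD=/usr/local/lib/libtcmalloc.so ./app

# option 2: link tcmalloc in when building
gcc -o app app.c -ltcmalloc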




WHY IS IT MORE EFFICIENT?
A good memory allocator needs to balance a number of goals, the two most prominent being minimizing time and minimizing space usage. Speed matters for any malloc implementation because if malloc gets faster, thousands of existing applications that bottleneck on dynamic memory allocation get a significant performance boost without changing a single line of code.


Disadvantages of Traditional malloc and an improvement:

  • Lots of wasted space, especially for small allocations: each header/footer occupies 4 bytes (on a 32-bit machine). If coalescing is adopted, every object must be surrounded by a header and a footer (without coalescing only the header is required), so N 8-byte objects end up occupying 16*N bytes.

  • If a free object is bigger than required, but not big enough to carve into smaller objects, there will be internal fragmentation.

  • No mechanism to separate small allocations from large ones (e.g. requests of more than 200 MB). The bin sizes cannot be adjusted to speed up particular kinds of allocations.

  • In a multithreaded application these data structures need to be protected with locks. As memory is allocated concurrently from multiple threads, all threads must wait in line while requests are handled one at a time, so every thread competes for access to the same heap; this is known as heap contention. Adding a second layer (a per-thread cache) on top of the "base allocator" greatly increases the scalability of the allocator.









TCMalloc’s design counters glibc malloc’s shortcomings:

  • Memory is allocated in runs of pages instead of arbitrary sizes, which greatly reduces the sbrk/mmap system-call overhead.

  • Instead of headers/footers, a global page map maps each page to the location holding that page's metadata. For a 64-bit architecture with 4 KB pages, up to 2^52 pages may have to be mapped, so a three-level radix tree is used to minimize memory cost; at the beginning about 4.5 MB is used for the mapping.

  • Small allocations (<= 32 KB) are separated from big ones. Each thread also gets a private thread cache, so lock-free multithreaded allocation is possible as long as there is enough space in the thread cache.

  • Large allocations are satisfied by the central page heap. The central heap is NOT thread-safe, so a spinlock has to be taken when allocating from it.
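To check whether a running process is really using tcmalloc rather than glibc malloc, one quick sketch is to look at its memory map; gperftools also honours a MALLOCSTATS environment variable that makes tcmalloc print allocator statistics when the program exits, though the exact output varies by version. The process name myserver is just a placeholder:

# see whether libtcmalloc is mapped into a running process
grep tcmalloc /proc/$(pgrep -n myserver)/maps

# dump tcmalloc statistics at exit
MALLOCSTATS=1 LD_PRELOAD=/usr/local/lib/libtcmalloc.so ./myserver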

Monday, October 21, 2013

MEMCACHE : FEW THINGS UNKNOWN

MEMCACHE ARCHITECTURE ISSUES


If you start memcache, by default it will take some memory and grow up to the configured limit. Let's say you set the memory limit to 200 MB. Memcache will keep allocating new memory blocks (slabs) for new items, but once it reaches 200 MB it won't be able to allocate any more, and from then on every new add or set command will evict some older entry. That's all fine as long as your memory limit is high and you are storing relatively small values; if you store 100 KB values you will run out of space soon, and the lifetime of your elements will therefore be shorter.
Memcache never releases memory, so once allocated it remains memcached's property.
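For reference, that memory limit is the -m option when starting the daemon; a minimal start line (adjust the port and user to your setup) looks like this:

# run as a daemon with a 200 MB memory limit on the default port
memcached -d -m 200 -p 11211 -u memcache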
Memcache allocates memory in 1 MB slabs, then divides each slab into 'chunks'; depending on this division, a 1 MB slab can only store items of a particular size class. For example:
● You store a value of 700 KB: memcache will create a 1 MB slab and mark it as a slab storing values of size 0.5-1 MB.
● Then if you add a value of 5 KB, it will not fill out the empty ~300 KB left in the previous slab. The new item has to be stored in the correct slab, so if there are no free chunks memcache will allocate another 1 MB and mark it as a slab storing items of size 4-8 KB (just an example).
● If you set another 6 KB item, memcache takes one of the empty chunks in the previously allocated 4-8 KB slab.
The problem is that if you run out of memory now and only have 3 slabs storing 0.5-1 MB items, it means you can only store 3 such items at once. Obviously, if your application needs to store 6 of them, they will keep evicting each other and your cache will become very inefficient.
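You can see this slab layout on a live server with the stats slabs command. A minimal sketch of a session, assuming memcached is listening on the default localhost:11211; the slab ids, chunk sizes and counts below are only illustrative:

$ telnet localhost 11211
stats slabs
STAT 1:chunk_size 96
STAT 1:chunks_per_page 10922
STAT 1:total_pages 1
STAT 1:used_chunks 1
...
STAT active_slabs 3
STAT total_malloced 3145728
END
quit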



REMOVING OLD ITEMS
Memcached does not have garbage collection, so there is no guarantee that it evicts expired items promptly. It also does not free memory, so you can easily have slabs allocated to big values that expired days ago; if there is no need for big items any more they will sit there forever, because memcache expires items only when you ask for them.
When you request the item with key XYZ, memcache finds it and checks its timestamp; if it is too old, the item is discarded.
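You can watch this behaviour from a client. In the sketch below (assuming memcached on localhost:11211), set mykey 0 3 5 stores a 5-byte value with flags 0 and a 3-second expiry; the first get returns the value, while a second get issued after the TTL has passed returns nothing but END:

$ telnet localhost 11211
set mykey 0 3 5
hello
STORED
get mykey
VALUE mykey 0 5
hello
END
get mykey
END
quit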


DEBUGGING
If you are using memcached for caching, it is sometimes necessary to check the state of the cache. There is no way to dump all keys stored in a memcached server, but using cachedump we can retrieve about a megabyte of data, which is often sufficient for debugging.
Use the stats items command to get stats about the different slabs of keys in your server. The number after "items:" is a slab id; memcached spreads your items across several slabs.
stats items

STAT items:1:number 1
STAT items:1:age 3430476
STAT items:1:evicted 0
STAT items:1:evicted_nonzero 0
STAT items:1:evicted_time 0
STAT items:1:outofmemory 0
STAT items:1:tailrepairs 0
STAT items:1:reclaimed 113
STAT items:2:number 4
STAT items:2:age 555952
STAT items:2:evicted 0
STAT items:2:evicted_nonzero 0
STAT items:2:evicted_time 0
STAT items:2:outofmemory 0
STAT items:2:tailrepairs 0
STAT items:2:reclaimed 12
STAT items:3:number 4
STAT items:3:age 2894457
STAT items:3:evicted 0
STAT items:3:evicted_nonzero 0
STAT items:3:evicted_time 0
STAT items:3:outofmemory 0
STAT items:3:tailrepairs 0
STAT items:3:reclaimed 4
STAT items:4:number 2
STAT items:4:age 3411747
STAT items:4:evicted 0
STAT items:4:evicted_nonzero 0
STAT items:4:evicted_time 0
STAT items:4:outofmemory 0
STAT items:4:tailrepairs 0
STAT items:4:reclaimed 9
STAT items:8:number 18
STAT items:8:age 1330321
STAT items:8:evicted 0
STAT items:8:evicted_nonzero 0
STAT items:8:evicted_time 0
STAT items:8:outofmemory 0
STAT items:8:tailrepairs 0
STAT items:8:reclaimed 1
STAT items:10:number 11
STAT items:10:age 3238392
STAT items:10:evicted 0
STAT items:10:evicted_nonzero 0
STAT items:10:evicted_time 0
STAT items:10:outofmemory 0
STAT items:10:tailrepairs 0
STAT items:10:reclaimed 0
END


To get the keys stored in each slab, use the cachedump command. In the command shown below we retrieve a maximum of one hundred keys from slab 4.

stats cachedump 4 100
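The output is one ITEM line per key; the key names, sizes and expiry timestamps below are placeholders for whatever your server actually holds:

ITEM user:1234:profile [2048 b; 1382345678 s]
ITEM homepage_html [512 b; 1382346991 s]
END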


Wednesday, April 17, 2013

Python - impressed me a lot

Recently I tried out Python.
I found it extremely easy and beautiful.
I now strongly feel Python should be the first language everybody learns.
Python may not be as fast or as small as C, but it's fun to use.
It helps you concentrate on algorithms rather than syntax and data types.
If people start with Python, more of them will love programming.

It's super crisp.

The hello world program is just:

print('hello world')

The cat command would be:

fin = open('file.txt')

for line in fin:
    print(line, end='')    # end='' because each line already ends with a newline

That's it.
It's so easy.

Also, GUI programming in Python is fast and simple.
I prefer PyQt, but PyGTK and wxPython would also do the job.

So if you want to teach programming to kids, make sure you start with Python.

Saturday, December 25, 2010

ashish adhav

whenever you feel that this is the end, you actually are at another start

Friday, December 10, 2010

life

I do not know why humans are given such a great IQ;
we do not even know what to do with it.