Parental control on home network

I recently looked into ways to block content on the home network. To protect the entire network, it seems like the filter should be placed on the router. This article on Lifehacker lists a few popular methods. I think using OpenDNS to filter is easy enough to get started. However, it’s quite easy to configure your connected computer to use a different DNS provider. However, one could set a static DNS on their tomato router.

Output data to Excel for reproducible post-hoc analysis or visualization

As much as I like to analyze and visualize data in R, I sometimes have the need to export results/data into Excel for my business partners or myself to consume in Excel or Powerpoint (eg, create custom/edit-able bar charts with various graphical overlays in a powerpoint slide). As I’ve been using the XLConnect package to read xls/xlsx files, I’m also using it to write data to a sheet in an Excel workbook. I write the necessary data out to the Data sheet:

library(XLConnect)
## read data from files/DB and manipulate to get to the final data set
## now, write data
writeWorksheetToFile(file='foo.xlsx', data=iris, sheet='Data', clearSheets=TRUE) ## in case number of rows is less than before

I ask my collaborators to not edit the Data sheet except adding filters or sorting when they are inspecting/eye-balling the data. I ask them to do all analysis in separate sheets. The reason for this is to keep the work reproducible in case the data needs to be refreshed (error in data, repeat the analysis on new data, etc). That is, when I need to refresh the data, I ask for the modified Excel workbook and write out refreshed data to the same sheet (hence the clearSheets=TRUE option in case the number of rows is less than before). That way, calculations or plots referencing columns in the Data sheet would automatically be refreshed in the workbook with the refreshed data.

This is just another way to prevent inefficiency in the work flow and allows for reproducibility even when collaborating with Excel workbooks.

This work flow should in theory also work with SAS:

proc export data=mydata outfile='foo.xlsx'
    dbms=xlsx ;
    sheet=Data ;
run ;

First hackintosh with Windows dual boot using Intel NUC DC3217BY

wpid-2014-02-07-multibeast.png

My friend recently introduced me to the Intel NUC (DC3217BY). It’s basically a micro form factor barebone system that comes with Intel’s ULV i3 processor (powerful and low power consumpton). I decided to get one, slapped on 8 GB of ram, a 256 GB mSATA SSD, and a Broadcom based half-sized e-PCI network card, and Hackintosh’d it since the processor is similar to what’s in an Apple Macbook Air. Basic instructions for this particular hardware could be found here and here. A generic guide could be found here. For dual booting with Windows, this article and this post helped. This is what I recalled doing to set up:

  • Update the BIOS to the latest version
  • Create a bootable Windows 7 usb drive on Ubuntu using unetbootin (must be version 494) (drive must be formatted to NTFS)
  • Get Mac OS X 10.9.1 (Mavericks) from the Apple App store
  • Download Unibeast and Multibeast at tonymacx86
  • Download Chameleon Wizard
  • Download Kext Installer
  • On an existing machine with Mac OS X, /Applications/Utilities/Disk Utility and a (> 8 GB) usb flash drive to Mac OS Extended (Master Boot Record enabled).
  • Run Unibeast to load the Mac OS X installer on it
  • Copy Multibeast, Chameleon Wizard, and Kext Installer into this flash drive. Download DSDT.aml and the patched AppleIntelFramebufferCapri.kext here and place them on the flash drive as well (these are to get HDMI audio to work).
  • Boot up the flash drive, and boot the installer with the flags -x PCIRootUID=1 GraphicsEnabler=Yes per this post relevant to Mavericks
  • Once the Mac installer is booted, go to the Utilities Menu and launch Disk Utility. Format the hard drive into two partitions. The first should be called “Macintosh HD” and formatted to Mac OS Extended (Journaled) and the second should be called “BOOTCAMP” and formatted to MS-DOS (FAT).
  • Shutdown, insert in the Windows 7 usb, and install Windows on the second partition
  • Boot the Mac usb again, and install Mac OS X on the first partition
  • Boot the Mac usb again, and select to boot into the Mac OS X partition
  • Run Multibeast to do some post-configurations so that the hardware just works (options in image below)
  • Edit the org.chameleon.Boot via Chameleon Wizard per the image below
  • Copy DSDT.aml to /Extra/DSDT.aml
  • Install Kext Installer. Use it to install the patched AppleIntelFramebufferCapri.kext, then use it to rebuild permissions and kext cache. Restart the computer to have HDMI audio working.

Multibeast options: 2014-02-07-multibeast.png

Chameleon Wizard options for org.chameleon.Boot: 2014-02-07-chameleon-wizard.png

What would I do differently now? Consider getting a network card with bluetooth like the Dell DW1702 per this. I’m not sure if my monitor has speakers, so this would enable me to use wireless speakers.

Now, time to mount the small NUC to the back of my 27″ monitor.

Bulk resize images and keeping original files

Suppose you have a directory (with subdirectories) full of images, and you want to resize them all while keeping the original images. To do so, first create a copy of the directory tree without the image files. Then, using a for loop, find each image file and apply the convert command to it. The following is an example to resize jpg files to 40% of the original quality.

mkdir /path/to/mirror_dir
find /path/to/image_dir -type d -exec mkdir -p /path/to/mirror_dir/{} \;
cd /path/to/image_dir
for i in $(find -iname "*.jpg"); do echo $i; convert -resize 40% $i /path/to/mirror_dir/$i; done

Skeleton to create fast automatic tree diagrams using R and Graphviz

wpid-2014-01-17-tree_diagram.png

I’ve had to create tree diagrams (dendrograms, decision trees) many times in the past to illustrate the flow of data or decisions (e.g., data flow for a study). This is usually a manual task done in MS Powerpoint or Visio. I’ve also made some diagrams in the past using Graphviz based on the DOT language to make creation more reproducible. However, that still felt pretty manual.

I decided to come up with a skeleton framework to generate these diagrams using R since I could connect to various data sources, do calculations, and mash up outputs fairly fast with it.

Here is my framework illustrated by an example:

digraph <- '# dot -Tpng diagram.gv > diagram.png
digraph g {
graph [rankdir="LR"]
node [shape="rectangle" style=filled color=blue fontcolor=white] ;
n [label="%s"] ;
n_a [label="%s" color=red] ;
n_b [label="%s"] ;
n_aa [label="%s" color=red] ;
n_ab [label="%s" color=red] ;
n_ba [label="%s"] ;
n_bb [label="%s"] ;
n -> n_a [label="%s"];
n -> n_b [label="%s"];
n_a -> n_aa [label="%s"];
n_a -> n_ab [label="%s"];
n_b -> n_ba [label="%s"] ;
n_b -> n_bb [label="%s"] ;
} 
'

string_list <- list(
    digraph
    , 'All calls\\nn=100,000'
    , 'Night\\nn=20,000'
    , 'Day\\nn=80,000'
    , 'Closed\\nn=5,000'
    , 'Open\\nn=15,000'
    , 'Closed\\nn=10,000'
    , 'Open\\nn=7,000'
    , 'Night'
    , 'Day'
    , 'Closed'
    , 'Open'
    , 'Closed'
    , 'Open'
    )

# http://stackoverflow.com/questions/10341114/alternative-function-to-paste
dot_file <- do.call(sprintf, string_list)
sink('diagram.gv', split=TRUE)
cat(dot_file)
sink()
# run: dot -Tpng diagram.gv > diagram.png

This will write a file called diagram.gv:

# dot -Tpng diagram.gv > diagram.png
digraph g {
graph [rankdir="LR"]
node [shape="rectangle" style=filled color=blue fontcolor=white] ;
n [label="All calls\nn=100,000"] ;
n_a [label="Night\nn=20,000" color=red] ;
n_b [label="Day\nn=80,000"] ;
n_aa [label="Closed\nn=5,000" color=red] ;
n_ab [label="Open\nn=15,000" color=red] ;
n_ba [label="Closed\nn=10,000"] ;
n_bb [label="Open\nn=7,000"] ;
n -> n_a [label="Night"];
n -> n_b [label="Day"];
n_a -> n_aa [label="Closed"];
n_a -> n_ab [label="Open"];
n_b -> n_ba [label="Closed"] ;
n_b -> n_bb [label="Open"] ;
}

Executing dot -Tpng diagram.gv > diagram.png, I will get the following output:

2014-01-17-tree_diagram.png

To make tree diagrams quickly, edit the structure of the diagram stored in the variable digraph. Dynamic texts should be inserted using sprintf via the list string_list. For example, do all calculations and then format the results in string_list. Then the diagram could be generated very quickly and will be reproducible. Changing any data source or any calculations would not result in manually re-creating the diagram!

Amp and USB Chargers

This is a good article that explains how USB charging works. Basically, avoid cheap chargers. For any reasonably good charger, the amperage of the charger doesn’t really matter so long as it exceeds what the device requires; that is, use a charger with at least 0.5 Amp if the device requires 0.5 Amp (what the original charger uses). Thus, it’s OK to use my 2 Amp chargers on most of my mobile device so things charge faster!

Hosting multiple WordPress sites with WordPress Multisite and Domain Mapping

I use WordPress to run my blog. I recently had the need to run another site and wanted to also use WordPress as my CMS. However, I don’t want to run another installation of WordPress if I don’t have to. I followed the directions on here to create enable WordPress Multisite, and followed the directions on here to make use of a different domain name for my second site. Basically,

  • backup my WordPress database and WordPress directory first
  • enable WordPress Multisite via wp-config.php
  • disable all WordPress plugins
  • follow the install options on my WordPress site
  • re-enable my WordPress plugins
  • install the WordPress MU Domain Mapping plugin
  • continue with instructions here

Some things specific to my setup:

  • sub-domain was enabled by default from my previous WordPress installation; I had to add the CNAME *blog.mydomain.com to my Host DNS configuration, where blog.mydomain.com is my current WordPress site. I also had to add the line ServerAlias blog.mydomain.com =*.blog.domain.com to my VirtualHost setup my Apache web server configuration.
  • mapped my WordPress site 2 from site2.blog.mydomain.com to site2.mydomain.com, where site2 is the name of my new WordPress site.

Now I am able to host multiple WordPress sites from the same WordPress installation. What is the down side? I have to manage plugins and themes that are available for each of the site…

Remotely access files on your home Windows computer

Most of my home computers/servers run Linux, so accessing them remotely is quite easy via the ssh protocol. Even the Windows machines I own have an ssh server installed via Cygwin.

Now, for those not familiar with Linux, one could

  • Install freeSSHd and set up an ssh server on Windows.
  • Configure an account and have freeSSHd initiate at startup.
  • Forward port 22 on the home router to port 22 on the machine with freeSSHd (assign it a static ip from the router); or, use a different external port for safety reasons (eg, 1022 -> 22).
  • Use the portable executable of WinSCP to access your files on any other Windows computer with internet access.