Overanalyzing
Sometimes the analytical phase of investigation produces useless information about the system.
source: Scagnetti et al. (2008): Reshaping Communication Design Tools
Highlighting a paragraph in LaTeX #2
In my last post on this topic I couldn't really find a way to highlight the background of a paragraph in LaTeX when it contained block-level elements like citations.
What actually does work exactly as I wanted it to in the first place is using the todonotes package:
\usepackage[bordercolor=white,backgroundcolor=gray!30,linecolor=black,colorinlistoftodos]{todonotes}
\newcommand{\rework}[1]{\todo[color=yellow,inline]{Rework: #1}}
Now marking up a paragraph in \rework{paragraph(s)} results in:
edit: this throws yet another error
! TeX capacity exceeded, sorry [input stack size=1500].
when the paragraph contains a footnote. Oh well - will have to do for now anyway.
Highlighting a paragraph in LaTeX
edit: see Highlighting a paragraph in LaTeX #2 for a better solution to this.
When writing a text in Microsoft Word I like to highlight sections with a "neon marker" effect, so I can see at a glance which parts still need reviewing or, for example, additional citations. So basically this is how it should look in a LaTeX-generated PDF as well:
Googling for "latex highlight paragraph" turns up blog posts (e.g. on devdaily.com) that suggest the following approach to highlight a paragraph:
Add the following to your preamble:
\usepackage{color}
\newcommand{\hilight}[1]{\colorbox{yellow}{#1}}
To highlight text in the body of your document, use
\hilight{this is some highlighted text}
There is a problem with the previous method is that the color box does not wrap properly. Here's a much easier and robust way of doing this
\usepackage{soul}
To highlight text in the body of your document, use
\hl{this is some highlighted text}
Unfortunately this approach only works if the highlighted paragraph contains no nested elements like a \citep command. If you, like me, use such elements in the text you want to highlight, you will get error messages, which makes the soul package quite unusable for the original purpose. I can't offer a direct solution for this problem, but at least in my case coloring the text itself (instead of marking up its background) did the job.
For this, add to the preamble:
\usepackage[usenames,dvipsnames]{color}
\newcommand{\markup}[1]{{\color{Cerulean}{#1}}}
In the text: \markup{whatever text including \citep[p. 5f]{Brügmann.2010} lorem ipsum bla bla}
which results in:
If this causes an error of type "option clash for package color", you can comment out the \usepackage[...] line and simply include the dvipsnames switch in your global \documentclass options, e.g.:
\documentclass[a4paper,oneside,dvipsnames]{book}
Finally, for changing the color used in the \markup command see WikiBooks:LaTeX/Colors.
Package natbib error: Bibliography not compatible with author year citation
LaTeX error message upon build:
Package natbib Error: Bibliography not compatible with author-year citations.
In my case this was caused by a missing year in a bib entry in Mendeley. This produced something along these lines in bibliography.aux (note the last line, which lacks the year field):
\bibcite{Wurman.2000}{{27}{2000}{{Wurman}}{{}}}
\bibcite{Wurman2001}{{28}{2001}{{Wurman et~al.}}{{}}}
\bibcite{Xerox2008}{{29}{{Xerox Corporation}}{{}}}
To fix this:
- Add the missing year to whatever bib entry lacks it (either directly in the .bib file or indirectly in JabRef or e.g. Mendeley)
- Clear your bibliography.aux file
- Rebuild
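To find the offending entries without reading the whole .aux file by eye, a small script can help. The following is only a minimal sketch: it assumes the \bibcite layout shown above (key, running number, then the year in its own brace group), and the function name keys_missing_year is made up for this example.

```python
import re

# Assumed layout (taken from the excerpt above):
#   \bibcite{key}{{number}{year}{{short author}}{{long author}}}
# The optional group captures the "{year}" part if it is present.
BIBCITE = re.compile(r"\\bibcite\{([^}]*)\}\{\{\d+\}(\{[^{}]*\})?")


def keys_missing_year(aux_text):
    """Return the citation keys whose \\bibcite line has no year field."""
    missing = []
    for line in aux_text.splitlines():
        match = BIBCITE.match(line.strip())
        # No year group at all, or an empty one, counts as missing.
        if match and match.group(2) in (None, "{}"):
            missing.append(match.group(1))
    return missing


sample = "\n".join([
    r"\bibcite{Wurman.2000}{{27}{2000}{{Wurman}}{{}}}",
    r"\bibcite{Wurman2001}{{28}{2001}{{Wurman et~al.}}{{}}}",
    r"\bibcite{Xerox2008}{{29}{{Xerox Corporation}}{{}}}",
])
print(keys_missing_year(sample))
```

For a real run, read the text of bibliography.aux into a string and pass it to keys_missing_year.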
Exporting Mediawiki Articles into Trac Wiki Pages
The problem doesn't seem to be exactly new, but apparently it is not as common as one might think either: you want to migrate your MediaWiki wiki pages into an (existing) Trac installation.
The following is the indirect path to the solution, with explanations of how things work. For the direct solution go to the section Direct Solution.
Indirect Solution
The TracHelpPage says regarding this issue:
You can use the mediawiki2trac.py script as a starting point. WikiProcessor for the MediaWiki styles has been started as trac plugin on http://trac-hacks.org/wiki/MediaWikiPluginMacro.
You are supposed to download the mentioned mediawiki2trac.py from the above link, copy it into a directory on your (web)server and execute it with e.g.:
python /PATH/mw2tw.py > mwexport-mediawiki2trac.sql
Unfortunately the script only does a minimal import and ignores versions, attachments (i.e. associated files), redirects etc.
A bit more interesting is the following Trac ticket: http://trac.edgewall.org/ticket/5241 especially regarding the different handling of attachment/media-files in MediaWiki and TracWiki. MediaWiki handles attached files in a completely different way than Trac does:
In Trac, you have attached files that are associated with a given page. Whereas in MediaWiki you have files that are uploaded and it's up to the wiki-editors to create links to the uploaded documents.
So, the process for dealing with "attachments" depends on a couple of factors.
If you want your downloaded documents to be unique in Trac like they are in mediawiki, then that's a problem. AFAIK, uniqueness in mediawiki is based on the filename whereas in Trac it's a combination of filename and what Wiki page the attachment is associated with.
So if you have an attachment that is linked in multiple places from your MediaWiki pages, then to have the same effect you'd either have to set up an independent web for these attachments (so all things would link to the same file) or you need to give up on the notion that the file is unique and that you have attachments that represent copies of the file in question.
So - what's wrong here:
- First we don't want redirect pages to create stub-pages in Trac with no redirect functionality.
- Secondly existing media-references should be kept intact.
As I am not really proficient with Python I reimplemented the migration script with PHP:
Features
- Import directly from the MediaWiki MySql database into the Trac Sqlite3 database file
- Convert basic MediaWiki markup to Trac markup (so no separate MediaWiki parser for trac is needed)
- Optionally remove category entries from wiki pages
- Preserving Image/Media links. This assumes you manually copy the old mediawiki attachment files into a new flattened out directory (without the hashed subdirectories)
- Optionally migrate revision comments
- Optionally keep redirect-links intact. Even though Trac doesn't know redirects as MediaWiki does the script will change links to redirect pages so the link directly points to the real page
- Optionally discard Stub pages
- Optionally discard pages with fewer than X bytes
- Remove custom specified substrings from pages (e.g. old MediaWiki templates)
- Replace custom specified substrings with replacement-strings (e.g. custom copyright notices)
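To give an idea of what the markup conversion and the redirect handling from the list above involve, here is a minimal sketch in Python. It is not the PHP script itself: the function name mediawiki_to_trac and the redirects mapping are made up for this example, and it only covers internal links and bullet lists (no images, tables or page names containing spaces).

```python
import re


def mediawiki_to_trac(text, redirects=None):
    """Convert internal links and bullet lists from MediaWiki to Trac markup.

    redirects is a hypothetical mapping {redirect page: real page}, used to
    point links straight at the real page because Trac has no redirects.
    """
    redirects = redirects or {}

    def convert_link(match):
        target = match.group(1).strip()
        label = match.group(2)
        target = redirects.get(target, target)
        if label:
            return "[wiki:%s %s]" % (target, label.strip())
        return "[wiki:%s]" % target

    # [[Page]] and [[Page|label]] become [wiki:Page] and [wiki:Page label]
    text = re.sub(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]", convert_link, text)

    # MediaWiki bullets start in column 0 ("*", "**"); Trac wants leading
    # whitespace, with deeper nesting expressed through more indentation.
    def convert_bullet(match):
        depth = len(match.group(1))
        return "  " * (depth - 1) + " * "

    return re.sub(r"^(\*+)[ \t]*", convert_bullet, text, flags=re.MULTILINE)


page = "See [[OldPage|the docs]] and [[Home]].\n* one\n** two"
print(mediawiki_to_trac(page, {"OldPage": "NewPage"}))
```

Headings (== Title ==) and bold/italic quotes look the same in both syntaxes, which is why the sketch leaves them alone.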
Limitations
There are some limitations though:
- Complex markup (e.g. tables) will not be converted
- As the script directly accesses the Trac Sqlite3 database file the corresponding Sqlite3 extension needs to be enabled in PHP
- The script discards old revisions, as I personally didn't need them - if you take a look at the old Python version of this script it shouldn't be hard to implement this though
Direct Solution
- Backup your trac database file - seriously - do it!
- Download and unzip the following script into a web-accessible directory on your webserver: mediawiki2trac.zip
- Edit the settings section at the start of the script - comments should hopefully be self-explanatory.
- Call mediawiki2trac.php in a web browser.
- You need Sqlite3 support in PHP for the script to work. If you get an error message like Fatal error: Class 'SQLite3' not found in.... you are missing the extension. See for example http://ubuntuforums.org/showthread.php?t=891767 on how to install this on Ubuntu.
- When you call the script it first starts in test mode, showing you which replacements etc. would be made. Activate live processing by clicking on the given link.
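The feature list assumes that the old MediaWiki upload files were copied by hand into one flattened directory. That step could be scripted roughly like this. It is a sketch under assumptions: the function name flatten_uploads is made up, and the skipped folder names (thumb, archive, temp, deleted) are what a typical MediaWiki images directory contains, so check your own installation first.

```python
import os
import shutil

# Assumption: these subfolders hold thumbnails, old versions etc. and are
# not wanted in the flattened copy.
SKIP_DIRS = {"thumb", "archive", "temp", "deleted"}


def flatten_uploads(source_root, target_dir):
    """Copy all files below source_root (e.g. a/ab/Name.png) into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for current, dirs, files in os.walk(source_root):
        # Prune skipped folders in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for name in sorted(files):
            destination = os.path.join(target_dir, name)
            if os.path.exists(destination):
                raise ValueError("duplicate file name: %s" % name)
            shutil.copy2(os.path.join(current, name), destination)
            copied.append(name)
    return sorted(copied)
```

The target directory should live outside the MediaWiki images folder, otherwise the walk would pick up the copies as well.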
Going Sans-Serif in LaTeX
While I am quite a fan of LaTeX, documents created with it often looked too "latexy" to me - similar to the feeling MS Office 2003 gave with Arial and Office 2007 gives with its omnipresent Calibri.
So - if you aren't a die-hard serif-only evangelist I can recommend the Kepler fonts packaged in kpfonts:
\usepackage[largesmallcaps,nofligatures]{kpfonts}
\renewcommand*\familydefault{\sfdefault}
\usepackage{sectsty}
\allsectionsfont{\textothersc}
Result:
Using Mendeley to manage BibTeX References and Citations
Mendeley Desktop organizes your research paper collection and citations. It automatically extracts references from documents, generates bibliographies, and is freely available on Windows, Mac OS X and Linux.
Mendeley Web lets you access your research paper library from anywhere, share documents in closed groups, and collaborate on research projects online. It connects you to like-minded academics and puts the latest research trend statistics at your fingertips.
Research papers collected using Mendeley
XNA: The specified Module could not be found
During development of the ConSense Cockpit solution I ran into a problem (see 348) when deploying a Windows Presentation Foundation (WPF) solution to a target client without XNA installed (btw - an extremely lame acronym from Microsoft), which resulted in:
The specified module could not be found. (Exception from HRESULT: 0x8007007E)
For me the trick was to include the libraries:
in the solution project and have them copied in a post-build event to the deployment folder (/installDir).
Because of the copyright trouble and litigation that would surely follow, I hope you understand that I just linked the respective Google queries instead of uploading the DLLs myself ;-(
Trac and Google
No idea why, but I couldn't get custom files in Trac's /htdocs/ folder to be accessible using e.g.
<Location "/robots.txt">
SetHandler None
</Location>
This caused some headaches on how to make this Trac installation / Blog accessible to Google. In the end three plugins helped to
- make Google see my robots.txt: RobotsTxtPlugin
- verify the site in Google Webmaster Central: GoogleWebmasterVerifyPlugin
- enable Google Analytics: TracGoogleAnalytics
The usual
sudo easy_install svn-path-to-the-plugin
and restarting Apache was all it took to get those three working.
Extracting Metadata from Office 2007 XML Documents
Apparently you can use DSOfile for Open XML documents, but at least for me there were problems in x64 environments - see Stackoverflow: Unable to read Office 2007 doc props using x86 dsofile.dll on x64 system:
In this 64-bit environment, DSOFile.dll can successfully read properties from Office 2003 documents (eg. DOC), but in the case of Office 2007 documents (eg. DOCX), only empty strings are returned for all properties, or else an error is generated.
So - use the Open XML SDK 2.0 for Microsoft Office. After downloading and referencing the dll:
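using System.IO.Packaging;              // PackageProperties (WindowsBase assembly)
using DocumentFormat.OpenXml.Packaging; // WordprocessingDocument and the property parts

// Open read-only (second argument: isEditable = false) and dispose the package when done.
using (WordprocessingDocument document = WordprocessingDocument.Open(filePath, false))
{
    // Core properties such as title, creator and dates
    PackageProperties standardProperties = document.PackageProperties;
    CoreFilePropertiesPart fileProperties = document.CoreFilePropertiesPart;

    // Extended (application) properties - the part is optional, hence the null check
    ExtendedFilePropertiesPart extendedPropertiesContainer = document.ExtendedFilePropertiesPart;
    if (extendedPropertiesContainer != null)
    {
        var extendedProperties = extendedPropertiesContainer.Properties;
    }

    // Custom properties - the part is optional as well
    var customPropertiesContainer = document.CustomFilePropertiesPart;
    if (customPropertiesContainer != null)
    {
        var customProperties = customPropertiesContainer.Properties;
    }
}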
Upgrading TeamCity - Catalina_Home and Basedir Issues
On Ubuntu:
The BASEDIR environment variable is not defined correctly. This environment variable is needed to run this program
Don't fall into the same trap as I did: remember to give execute permissions to the .sh and .jar files in TeamCity/bin after uploading the new TeamCity files from Windows.
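If you would rather script the permission fix than do it by hand, something like the following could work. It is a sketch only: the function name make_executable is made up, and the path in the usage note is a placeholder for wherever your TeamCity/bin directory lives.

```python
import os
import stat


def make_executable(bin_dir, suffixes=(".sh", ".jar")):
    """Add the execute bit (user, group, other) to matching files in bin_dir."""
    changed = []
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if os.path.isfile(path) and name.endswith(suffixes):
            mode = os.stat(path).st_mode
            # Keep the existing permission bits and only add the execute bits.
            os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
            changed.append(name)
    return changed
```

Calling make_executable("/PATH/TeamCity/bin") returns the names of the files it touched.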
By the way - to include the build status of your TeamCity installation in Trac, create a file TeamCityPlugin.py in your trac/plugins folder:
from trac.wiki.macros import WikiMacroBase
import urllib

class TeamCityStatusMacro(WikiMacroBase):
    """Inserts the current build status into the wiki page."""

    def expand_macro(self, formatter, name, args):
        # Python 2 urllib: fetch the HTML snippet of TeamCity's external status page
        res = urllib.urlopen('http://path-to-teamcity:8111/externalStatus.html').read()
        return res
Reusing Subversion-managed Library Projects in Visual Studio
Just posting this as it would have saved me some time myself back then:
In the ConSense Prototype I have some custom libraries (especially the proprietary RDF framework) which are used in multiple application solutions (WPF Cockpit, VSTO plugins, Windows/WCF service, plugins for that service etc.).
So - even if you use Subversion, the question remains how to keep the libraries in sync when they are included in multiple Visual Studio solutions and may be changed in several of them at the same time before a check-in can occur.
The solution for me - and I hope you allow some advertisement for a commercial software here - was the combination of svn:externals and the VisualSVN Visual Studio plugin.
The libraries are kept in a separate directory in subversion and referenced as an external in the "main" application's subversion properties:
Select the application directory in subversion (Tortoise -> Repo-browser) and in the properties section (right click on the directory and select properties) add a new key "svn:externals". The value looks something like
Externals/ConSense.Generic.Rdf http://subversion.consense-project.com/2_Generic/ConSense.Generic.Rdf
with the first part being the directory in your application solution into which the external sources are placed upon checkout, and the second being the svn path to the external. When you now check out the application using VisualSVN, the library source files are automatically checked out as well, into the directory you specified as the first parameter in the properties (here: /Externals/).
The only caveat is that the global checkout button of VisualSVN only checks out the main solution, not the included externals, so additional manual checkouts (right click on the external project --> subversion --> checkout) are needed.
Resolving RDF Relationship-targets at Runtime
From Jeni Tennison: On Resolvability:
There are three ways of locating the documentation about a particular property or class:
1) looking through the general documentation the data publisher has provided
2) resolving the URI of the class or property
3) searching
Using URIs for classes and properties provides a mechanism for applications to get hold of this extra information about unknown vocabularies. They might try four tactics, in order of priority:
1) look at the data they already know; the information they need about the unknown properties and classes may be included in the files they’ve already accessed (including those containing data)
2) look in an application-specific (possibly cloud-hosted) cache of vocabularies that the application has already downloaded
3) resolve the URI of the class or property by performing an HTTP GET (and add it to the application-specific cache)
4) look in a general-purpose cache, such as the Internet Archive or an ontology repository such as Swoogle
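The four tactics translate quite directly into a lookup chain. The following is a rough sketch, not taken from the quoted article: describe_term and all of its parameters are names made up for this example, the two caches are plain dictionaries or callables, and the HTTP GET is passed in as a function so the sketch stays independent of any particular RDF or HTTP library.

```python
def describe_term(uri, known_data, app_cache, http_get, general_cache):
    """Try the four tactics in priority order.

    known_data and app_cache are dicts {uri: description}; http_get and
    general_cache are callables that return a description or None.
    Returns a tuple (source, description), or (None, None) if all fail.
    """
    # 1) data the application already has
    if uri in known_data:
        return "known data", known_data[uri]
    # 2) application-specific vocabulary cache
    if uri in app_cache:
        return "application cache", app_cache[uri]
    # 3) resolve the URI and remember the result in the application cache
    description = http_get(uri)
    if description is not None:
        app_cache[uri] = description
        return "http get", description
    # 4) general-purpose cache or ontology repository
    description = general_cache(uri)
    if description is not None:
        return "general cache", description
    return None, None


cache = {}
term = "http://example.org/vocab#knows"  # placeholder URI
print(describe_term(term, {}, cache, lambda u: "label: knows", lambda u: None))
print(describe_term(term, {}, cache, lambda u: None, lambda u: None))
```

The second call is answered from the application cache, because the first one stored the description it had fetched.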
Generic APIs:
- RKB Explorer: http://www.rkbexplorer.com/
- DBpedia: http://lookup.dbpedia.org
- Sindice: http://sindice.com
- sameas.org: http://sameas.org
SPARQL Endpoints:
- http://esw.w3.org/topic/SparqlEndpoints
- DBLP Bibliography Database published through D2R Server (Freie Universität Berlin) : http://www4.wiwiss.fu-berlin.de/dblp/sparql
- Linked Movie Data Base : http://www.linkedmdb.org/
edit:
strange - a Sindice API call returns an XML result set alright, but the URI of the query itself contained in it (apparently meant as a self-reference) leads nowhere. See for example:
- A call to http://api.sindice.com/v2/search?q=Hinnerk+Br%FCgmann&qt=term&page=1
- contains as Uri (note the additional /api/): http://api.sindice.com/api/v2/search?amp%3Bqt=term&page=1&q=+Hinnerk+Br%FCgmann
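Taking the two URLs apart makes the differences easier to see. This is a small sketch using only Python's standard library; the helper name split_url is made up, and the Latin-1 decoding is an assumption based on %FC standing for "ü".

```python
from urllib.parse import urlsplit, parse_qs

requested = "http://api.sindice.com/v2/search?q=Hinnerk+Br%FCgmann&qt=term&page=1"
returned = "http://api.sindice.com/api/v2/search?amp%3Bqt=term&page=1&q=+Hinnerk+Br%FCgmann"


def split_url(url):
    """Return the path and the decoded query parameters of a URL."""
    parts = urlsplit(url)
    # Assumption: the query is percent-encoded as Latin-1 (%FC = "ü").
    return parts.path, parse_qs(parts.query, encoding="latin-1")


req_path, req_params = split_url(requested)
ret_path, ret_params = split_url(returned)

print(req_path, sorted(req_params))
print(ret_path, sorted(ret_params))
```

Besides the extra /api/ in the path, the returned URI carries a parameter named "amp;qt" instead of "qt", which would be consistent with an &amp; entity having been escaped once too often on the server side.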
Open Research in Information Systems
I am constantly frustrated that the EU, or whoever is kind enough to sponsor research activities, does not oblige the respective institutes - as part of their funding obligations - to open up the program code of whatever prototypes or tests were developed or conducted.
Rigorous empirical/behavioral research is fine and dandy, but the more practical advances in our field require an enormous amount of coding hours. So while I am able to read quite interesting and clever papers on previous research activities, the most basic of wheels regarding prototype development have to be reinvented over and over. Even if scientific publications actually reveal the core algorithms used in a modern (rich) client prototype, the complexity of such applications makes rebuilding that piece of software (which is often necessary to actually "go further") far from trivial.
Add the fact that quite a bit of the baseline research in IS is performed in a small-group or even individual-in-a-closed-closet type of environment, and that those individuals have their own agenda of not spending 12 years on a Ph.D., and you end up with a self-selected research scope that can be accomplished using existing open-source libraries/platforms (and is therefore far smaller than desirable). This leads to the question of who is helping whom here - shouldn't the open source community benefit from publicly tax-funded research instead of vice versa?
I won't even get started on hype topics appearing in blogs one year before they do in Gartner Hype Cycles and two years before they do at research conferences, but please please - allow us dwarves to step on the shoulders of other dwarves to accomplish more gigantic tasks.
edit: this rant was kind of spawned by my frustration that Microsoft Research lets their activity logger PersonalVibe2 (which they kindly provided - in compiled form - to the public) rot unusable on Win7/64-bit systems, even though supposedly only a 5-minute modification would be required to make it work again. Of course, as a private company MS is free to do as they wish and is not the real target of the above paragraphs. Still: facepalm.
WCF inside Windows Service: HTTP could not register URL http://+:8000
Following the usual-subject kind of MSDN tutorials to make a Windows Service able to communicate with a WPF application (that is have a WCF service hosted inside your Windows Service) I ended with this kind of app.config section:
<services>
  <service name="ConSense.Sensors.Service.ConSenseSensorsWcfService" behaviorConfiguration="ConSenseSensorsWcfServiceBehavior">
    <host>
      <baseAddresses>
        <add baseAddress="http://localhost:8000/ConSense/service"/>
      </baseAddresses>
    </host>
    <!-- this endpoint is exposed at the base address provided by host: http://localhost:8000/ConSense/service -->
    <endpoint address="" binding="wsHttpBinding" contract="ConSense.Sensors.Service.IConSenseSensorsWcfService" />
    <!-- the mex endpoint is exposed at http://localhost:8000/ConSense/service/mex -->
    <endpoint address="mex" binding="mexHttpBinding" contract="IMetadataExchange" />
  </service>
</services>
In a UAC environment (Vista/Win7) this leads to the error message shown in the title of this post: HTTP could not register URL http://+:8000 - meaning your LocalService account is not allowed to open such an endpoint.
Maybe due to SOA hype or whatever, every tutorial and help file around seems to assume you want to build a supercool SOAP business service exposed over the internet the moment you set eyes on WCF. As a result, the top 10 Google hits for this exception suggest you should just run Visual Studio with admin privileges and be done with it. The better answer (at least for my scenario) was to skip the HTTP stuff and use a named pipe as the WCF endpoint instead:
<services>
  <service name="ConSense.Sensors.Service.ConSenseSensorsWcfService" behaviorConfiguration="ConSenseSensorsWcfServiceBehavior">
    <host>
      <baseAddresses>
        <add baseAddress="net.pipe://localhost/ConSenseSensorsService"/>
      </baseAddresses>
    </host>
    <!-- this endpoint is exposed at the base address provided by host: net.pipe://localhost/ConSenseSensorsService -->
    <endpoint address="net.pipe://localhost/ConSenseSensorsService" binding="netNamedPipeBinding" bindingConfiguration="netNamedPipeBindingUnsecure" contract="ConSense.Sensors.Service.IConSenseSensorsWcfService" />
    <!-- this mex endpoint is exposed at the base address provided by host: net.pipe://localhost/ConSenseSensorsService/mex -->
    <endpoint address="mex" binding="mexNamedPipeBinding" contract="IMetadataExchange" />
  </service>
</services>
<bindings>
  <netNamedPipeBinding>
    <binding name="netNamedPipeBindingUnsecure">
      <security mode="None">
      </security>
    </binding>
  </netNamedPipeBinding>
</bindings>
<!--For debugging purposes set the includeExceptionDetailInFaults attribute to true-->
<behaviors>
  <serviceBehaviors>
    <behavior name="ConSenseSensorsWcfServiceBehavior">
      <serviceMetadata />
      <serviceDebug includeExceptionDetailInFaults="True" />
    </behavior>
  </serviceBehaviors>
</behaviors>
Debugging a Windows Service
This may be blatantly obvious to most developers, but just to save someone the time of figuring it out: to attach the debugger to your process, put a sleep at the start of the service's Main method so you have enough time to attach before whatever may be causing an exception runs:
System.Threading.Thread.Sleep(10000);
Custom CSS for Trac
I really love Trac - don't get me wrong - but red just isn't my color.
So I made a couple of halfhearted attempts in the past to style my Trac installation (the one you are looking at right now) a bit, but always had a déjà vu when stumbling across Trac: Customizing the Trac Interface, as it - in my understanding - meant I had to learn the templating engine used by Trac (Genshi). Being a Python noob myself, investing a lot of time digging through sources wasn't exactly desirable either.
Luckily Trac-Hacks.org hosts a plugin called ThemeEnginePlugin which allows replacing the default Trac theme with custom theme-packs. There are a couple pre-prepared theme-packs available at Trac-Hacks:Themes but half of them look broken (at Trac 0.11) and actually I was quite satisfied to find that the ThemeEnginePlugin allows you to append custom CSS to every Trac-page letting you overwrite whatever suits your needs in the default CSS definitions. Just remember to update your trac.ini (and restart apache afterwards) with
[theme]
enable_css = true
Thanks Noah Kantrowitz aka coderanger!
Ah yes - just as it took me a while to figure this out back then: inserting arbitrary html in Trac wikipages can be done with
{{{
#!html
HTML HERE
}}}
Log4Net in a Windows Service
Requirement: I want a Windows Service
- under LocalService (and not LocalSystem see Microsoft MSDN: Secure Hosting and Deployment of WCF Services )
- using Log4Net starting when the service's Main() method is called - so I can see if the service starts up at all
Seems easy, but - no log file was created!
The help at Apache FAQ: Why doesn't the logging in my service work? is fine, but maybe I just skipped over it - it took me a while to figure out that log4net not showing any results (meaning: not writing a log file) was related to the Windows service not being able to find my application.config file (because the service was started from a different location than my output dir).
The Apache FAQ suggests using AppDomain.BaseDirectory, which unfortunately fails when I want to initialize the logger in the service's Main() method, due to Main being static. Copying the app.config to, let's say, c: and using
log4net.Config.XmlConfigurator.Configure(new FileInfo("c:\\app.config"));
works, but isn't exactly what we want - at the very least it should use some temporary or application-specific dir in %appdata%
edit: time to bang head against something again - a failure in the configuration file caused the problem.
This line allowed logging to a client-specific directory:
<file type="log4net.Util.PatternString" value="%env{APPDATA}\\ConSense\\Logs\\ConSense.Sensors.Service.ConsoleStart.Log.txt" />
Btw - a very nice introduction to Log4Net:
ConSense Project
Management of unstructured Information using semantic Metadata
The number of unstructured electronic documents in enterprise environments is growing rapidly. The ConSense project aims to assist the enterprise-wide lifecycle management of electronic documents by utilizing the context in which knowledge workers access documents. From this context data one can deduce semantic relations among documents and business-domain-specific entities, which can be combined into a semantic network. Querying the resulting network allows for the discovery and reuse of unstructured documents.
- Actual Business Domain: Knowledge workers use, create and collaborate with documents in business processes.
- Context Sensors: Client-side software plugins track the context of business-relevant document usage and submit it in condensed form to the central semantic virtualization-store.
- Semantic Virtualization: The context information is analyzed using heuristic business-rules resulting in semantic relationships among documents, persons, products, processes and services within the domain.
- Task-specific Information: The semantic relationships are used to proactively supply knowledge workers with information and documents related to their actual task-context.
ConSense is a project of the Department for Information Systems II (Service and Process Management - Prof. Bodendorf)
University of Erlangen-Nürnberg
Project contact: Hinnerk Bruegmann
Student Participation - Master Thesis
- Sauer, Dieter: Automatic Detection of HCI Action Clusters and corresponding Workflow Patterns (Automatische Erkennung von Aktionsclustern und deren Zuordnung zu Workflows)
- Sauer, Dieter: Using Metaanalysis to extract contextual Information from the local File System
- Heckl, Christian: Interaktive Informationsvisualisierung im "Document Lifecycle Management"
- Steininger, Oliver: Erkennung von E-Mail Konversationsfäden (Recognizing Conversation Threads in Email Communication)
Student Participation - Term Papers
- Heckl, Christian; Sauer, Dieter: Unterstützung der Analyse von Informationszusammenhängen durch selektive Visualisierung (Supporting the Analysis of interconnected Information by selective Visualization)