Obsidian | Another Way to Extract Web Content

I often read articles and blog posts on the web and annotate passages I find interesting. My approach is to right-click and copy the URL that points to the paragraph, paste it into Obsidian, and write down my thoughts. This way, my notes link back to the original text.

However, these URLs that point to specific text are often long and unreadable, for example:

http://www.qncd.com/?p=9085#:~:text=%E8%BF%99%E4%B8%AA%E5%AD%A6%E6%9C%9F%EF%BC%8C%E8%BF%99%E9%97%A8%E8%AF%BE%E7%9A%84%E8%AE%A1%E5%88%92%E6%98%AF%E8%AE%B2%E5%8D%81%E4%BA%94%E4%B8%AA%E6%97%A5%E5%B8%B8%E8%AF%8D%E6%B1%87%E3%80%82%E8%BF%98%E5%A5%BD%E8%AF%BB%E8%BF%87%E8%AE%B8%E6%99%96%E7%9A%84%E4%B8%83%E6%9C%AC%E4%B9%A6%EF%BC%8C%E6%9C%89%E4%B8%80%E7%99%BE%E5%A4%9A%E4%B8%87%E5%AD%97%E6%89%93%E5%BA%95%E2%80%94%E2%80%94

The reason is that Chinese characters and spaces cannot appear in a URL as-is, so the browser percent-encodes them into %XX sequences. To make these long URLs shorter and more readable, they only need to be decoded. After decoding, the URL looks like this:

http://www.qncd.com/?p=9085#:~:text=这个学期，这门课的计划是讲十五个日常词汇。还好读过许晖的七本书，有一百多万字打底——
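The conversion between the two URLs above is ordinary percent-decoding. As a minimal sketch in plain JavaScript (the shortened sample URL and the variable names are illustrative, not taken from the original setup), the built-in decodeURIComponent turns each %XX sequence back into the character it encodes:

```javascript
// Illustrative sample: a shortened version of the URL shown above.
const encoded = 'http://www.qncd.com/?p=9085#:~:text=%E8%BF%99%E4%B8%AA%E5%AD%A6%E6%9C%9F';

// decodeURIComponent is a standard JavaScript built-in that reverses percent-encoding.
const decoded = decodeURIComponent(encoded);
console.log(decoded);

// encodeURI goes the other way for non-ASCII characters and spaces,
// which is roughly what the browser does when it produces the long form.
const reEncoded = encodeURI(decoded);
console.log(reEncoded === encoded);
```

The decoded form is only for reading and note-taking; the browser re-encodes it when the link is opened.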

The benefits are obvious: the link is concise and readable, and even if the original text is deleted, I still know which sentence I referenced. This method does have a flaw: if the text in the original article is modified, the URL may no longer locate that sentence. For me, though, this is not important.
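Because the referenced sentence is stored in the URL itself, it can be recovered without visiting the page. A rough sketch, assuming the simple case of a single text directive with no prefix or suffix parts (the helper name quotedText and the example.com URL are hypothetical):

```javascript
// Hypothetical helper: returns the text quoted by a ":~:text=" fragment directive,
// or null when the URL has none.
function quotedText(url) {
  const marker = ':~:text=';
  const start = url.indexOf(marker);
  if (start === -1) return null;
  // Keep only the first directive; additional directives are separated by "&".
  const directive = url.slice(start + marker.length).split('&')[0];
  // Works on the percent-encoded form of the URL as copied from the browser.
  return decodeURIComponent(directive);
}

console.log(quotedText('https://example.com/page#:~:text=hello%20world'));
```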

If you have the same need, here is my setup. You will need:

Plugin: Templater - Used to process long URLs, requires configuration of templates
Plugin: Surfing (optional) - Used to read and excerpt articles within Obsidian without going to a separate browser

Templater Configuration

Search for the plugin in the community plugin marketplace, or download it from: https://github.com/SilentVoid13/Templater
Create a dedicated template folder and specify it in Templater settings (Settings/Templater/Template folder location)
Create a template file in the above folder to handle URLs (e.g., named "Escape Link") and enter the following content:

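<%*
  // Text currently selected in the editor; used as the link title if present
  const selection = tp.file.selection()
  // The URL copied from the browser, read from the system clipboard
  const urlSource = await tp.system.clipboard()
  // Decode percent-encoded characters so the URL is readable
  const url = decodeURIComponent(urlSource)
  // Prompt for a title only when nothing is selected
  const title = selection && selection.length ? selection : await tp.system.prompt('Please enter a title')
  // Insert a Markdown link; fall back to the URL itself if no title was given
  tR += `[${title ? title : url}](${url})`
%>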

At this point, the setup is complete. Select a paragraph of text in the browser, right-click, and copy the URL pointing to that paragraph. Switch to Obsidian, press cmd+p to open the command palette, and run the "Templater: Open Insert Template modal" command (assigning a hotkey to this command is recommended). Choose the "Escape Link" template you just created. If no text is selected in the editor, a prompt asks for a title; enter it and press Enter to insert the link.
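One caveat with the template: decodeURIComponent throws a URIError on malformed percent sequences, and a decoded URL that contains spaces can break the [title](url) Markdown syntax. A possible variation, sketched here as a standalone function (the name toMarkdownLink is hypothetical and not part of Templater), falls back to the raw URL when decoding fails and wraps the destination in angle brackets, which CommonMark permits for destinations containing spaces. It is worth checking that your Obsidian version renders such links as expected.

```javascript
// Hypothetical helper: builds a Markdown link from a title and a copied URL.
function toMarkdownLink(title, rawUrl) {
  let url;
  try {
    url = decodeURIComponent(rawUrl);
  } catch (err) {
    // Malformed percent sequence (URIError): keep the URL exactly as copied.
    url = rawUrl;
  }
  // "<", ">" and line breaks are not allowed inside <...>, so fall back to the raw URL.
  if (/[<>\n]/.test(url)) url = rawUrl;
  // Use the URL as the label when no title was provided.
  const label = title && title.length ? title : url;
  return `[${label}](<${url}>)`;
}

console.log(toMarkdownLink('Note', 'https://example.com/#:~:text=hello%20world'));
```

Inside a Templater template, the same logic could replace the last two statements, with the result appended to tR.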

Surfing Configuration

If you want to read and excerpt paragraphs inside Obsidian, you can install this plugin. Otherwise, skip this section.

Search for the plugin in the community plugin marketplace, or download it from: https://github.com/PKM-er/Obsidian-Surfing
Turn on the switch "Customize the format of highlighted links"
In the next menu "Copy the format of highlighted links", delete the default settings and enter: {URL}

Now, when you open a webpage in Obsidian and select a piece of text, right-clicking shows the menu item "Copy the format of highlighted links". Click it to copy the URL pointing to that text, then process it with the "Escape Link" template as described in the Templater section.

(Note: Special thanks to Boninall for plugin support and to Rice Rat for template support.)