<h1 id="regex-for-capturing-all-the-urls-in-a-paragraph-except-for-a-specific-domain"><a aria-hidden="true" class="anchor-heading icon-link" href="#regex-for-capturing-all-the-urls-in-a-paragraph-except-for-a-specific-domain"></a>Regex for capturing all the urls in a paragraph except for a specific domain</h1>
<blockquote>
See <a href="https://stackoverflow.com/a/72392318/6456163">here</a> for the original answer.
</blockquote>
This should be sufficient for your use case:
<pre class="language-regex"><code class="language-regex">/(?&#x3C;!\S)(?:https?:\/\/)?(?:(?:(?!example)\w+[.-])+[a-z]{2,11})(?!\S)/gi
</code></pre>
See <a href="https://regex101.com/r/NdOxKt/1">here</a> for a demonstration of the regex at work. Below is a rough explanation of what the regex is doing:
<ul>
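<li>The leading <code>(?&#x3C;!\S)</code> and trailing <code>(?!\S)</code> assert that the match is not directly preceded or followed by a non-whitespace character, so each match must be a whole whitespace-delimited token (spaces, tabs, and newlines all count as delimiters)</li>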
<li>The <code>?:</code> syntax makes each set of parentheses it appears in a non-capturing group, so the engine does not store the sub-match, which avoids a small amount of memory and execution overhead</li>
<li><code>(?:https?:\/\/)?</code> optionally matches a leading <code>http://</code> or <code>https://</code>, while the characters <code>:</code> and <code>/</code> are not accepted anywhere else, so a token that contains a path or port is rejected</li>
<li><code>(?:(?!example)\w+[.-])+</code> looks for one or more labels (runs of word characters) that do not begin with <code>example</code>, each followed by either a hyphen or a period</li>
<li><code>[a-z]{2,11}</code> matches the final domain extension, e.g. <code>com</code>, <code>org</code>, or <code>enterprises</code></li>
</ul>
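The pieces described above can be tried out with a minimal Python sketch. This is an illustration rather than part of the original answer: the function name find_urls and the sample sentence are hypothetical, and the pattern is a port of the JavaScript-style literal (the escaped slashes become plain slashes and the i flag becomes re.IGNORECASE).

```python
import re

# Port of the pattern above to Python's re module (assumed equivalent;
# the g flag is replaced by findall, the i flag by re.IGNORECASE).
URL_PATTERN = re.compile(
    r"(?<!\S)(?:https?://)?(?:(?:(?!example)\w+[.-])+[a-z]{2,11})(?!\S)",
    re.IGNORECASE,
)


def find_urls(text: str) -> list[str]:
    """Return every whitespace-delimited URL whose labels do not start with 'example'."""
    # All groups are non-capturing, so findall returns the full matches.
    return URL_PATTERN.findall(text)


# Hypothetical sample text, not taken from the original question.
sample = (
    "Visit https://google.com and example.com "
    "or sub.example.org plus foo-bar.io today"
)
print(find_urls(sample))  # ['https://google.com', 'foo-bar.io']
```

Note that the negative lookahead only checks the start of each label, so a domain such as examples.com is skipped as well, while notexample.com is still captured.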
<hr>
<h2 id="tags"><a aria-hidden="true" class="anchor-heading icon-link" href="#tags"></a>Tags</h2>
<ol>
<li><a title="Private" href="https://wiki.dendron.so/notes/hfyvYGJZQiUwQaaxQO27q.html" target="_blank" class="private">regex (Private)</a></li>
<li><a title="Private" href="https://wiki.dendron.so/notes/hfyvYGJZQiUwQaaxQO27q.html" target="_blank" class="private">stack-overflow (Private)</a></li>
<li><a title="Private" href="https://wiki.dendron.so/notes/hfyvYGJZQiUwQaaxQO27q.html" target="_blank" class="private">answer (Private)</a></li>
</ol>
